TensorFlow returning NaN for what should be a simple calculation


I'm trying to learn TensorFlow by working through their tutorial and making small modifications. I ran into an issue where a tiny change to the code causes the output to become NaN.

Their original code is:

import numpy as np
import tensorflow as tf

# Model parameters
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)
# Model input and output
x = tf.placeholder(tf.float32)
linear_model = W * x + b
y = tf.placeholder(tf.float32)
# loss
loss = tf.reduce_sum(tf.square(linear_model - y)) # sum of the squares
# optimizer
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
# training data
x_train = [1,2,3,4]
y_train = [0,-1,-2,-3]
# training loop
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init) # reset values to wrong
for i in range(1000):
  sess.run(train, {x:x_train, y:y_train})

# evaluate training accuracy
curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x:x_train, y:y_train})
print("W: %s b: %s loss: %s"%(curr_W, curr_b, curr_loss))

Its output is:

[the output of this first run is not preserved in the original post; as described below, it shows the same CPU-feature warnings as the second run, followed by the correct values W ≈ -1 and b ≈ 1]

Note all the warning messages I receive on every run, which appear because I installed via pip instead of compiling it myself. Still, it does produce the correct output, W = -1 and b = 1.

I modified the code to this, the only change being additional values in the x_train and y_train lists:

import numpy as np
import tensorflow as tf

# Model parameters
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)
# Model input and output
x = tf.placeholder(tf.float32)
linear_model = W * x + b
y = tf.placeholder(tf.float32)
# loss
loss = tf.reduce_sum(tf.square(linear_model - y)) # sum of the squares
# optimizer
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
# training data
x_train = [1,2,3,4,5,6,7]
y_train = [0,-1,-2,-3,-4,-5,-6]
# training loop
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init) # reset values to wrong
for i in range(1000):
  sess.run(train, {x:x_train, y:y_train})

# evaluate training accuracy
curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x:x_train, y:y_train})
print("W: %s b: %s loss: %s"%(curr_W, curr_b, curr_loss))

Here is the output of this new code:

2017-07-22 22:23:13.129983: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.130125: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.130853: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.130986: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.131126: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.131234: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.132178: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.132874: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
W: [ nan] b: [ nan] loss: nan

I really don't understand why extending the training data causes this to happen. Am I missing something?

Also, I'm not at all sure how to debug things in TensorFlow, for example getting values printed incrementally while a variable is being updated inside a loop. Simply printing the variable doesn't seem to work. I'd like to know, so that I can debug this kind of thing myself in the future!


1 Answer

Welcome to the wonderful world of hyperparameter tuning. Here is something you can try: first, print some output inside the for loop instead of only at the end, so the loop becomes:

for i in range(1000):
    curr_W, curr_b, curr_loss,_ = sess.run([W, b, loss, train], {x:x_train, y:y_train})
    print("Iteration %d W: %s b: %s loss: %s"%(i, curr_W, curr_b, curr_loss))

If you run this, the output looks as follows:

[the per-iteration output is not preserved in the original post; as described below, it shows W, b and the loss growing rapidly towards infinity]

At this point you should see that the values of W and b are being updated very aggressively, and that instead of decreasing, your loss is actually increasing and heading towards infinity very quickly. That in turn means your learning rate is far too large. If you divide the learning rate by 10 and set it to 0.001, the final result is:

W: [-0.97952145] b: [ 0.8985914] loss: 0.0144026

This means your model has not converged yet (look at the earlier outputs again; ideally you would plot the loss over time). The next experiment, with the learning rate set to 0.005, gives:

W: [-0.99999958] b: [ 0.99999791] loss: 6.48015e-12
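
For reference, the only line that changes between these experiments is the constant passed to the optimizer. A minimal sketch (0.01, 0.001 and 0.005 are the values discussed above; everything else in the script stays the same):

# Diverges once the training set is extended to 7 points:
# optimizer = tf.train.GradientDescentOptimizer(0.01)

# Converges, but too slowly to finish within 1000 iterations:
# optimizer = tf.train.GradientDescentOptimizer(0.001)

# Converges within 1000 iterations in the experiment above:
optimizer = tf.train.GradientDescentOptimizer(0.005)
train = optimizer.minimize(loss)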

So, to conclude:

  • Try printing intermediate results from sess.run() (or eval() on some tensors) to see how the model is learning; see the sketch below this list.
  • Hyperparameter tuning, for fun and profit.
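
As a small extension of the diagnostic loop above (my own sketch, not part of the original answer), you can also abort training as soon as the loss diverges, which keeps the log short while you experiment with learning rates:

import numpy as np

# Assumes the graph, session and feed values from the script above already exist.
for i in range(1000):
    curr_W, curr_b, curr_loss, _ = sess.run([W, b, loss, train], {x: x_train, y: y_train})
    if np.isnan(curr_loss):
        # The loss has blown up; there is no point in continuing.
        print("Loss became NaN at iteration %d, stopping early" % i)
        break
    print("Iteration %d W: %s b: %s loss: %s" % (i, curr_W, curr_b, curr_loss))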

Note: at the moment you are still using "plain" gradient descent with a fixed learning rate, but there are also optimizers that adapt the learning rate automatically. The choice of optimizer (and its parameters) is yet another set of hyperparameters.
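
For example (a sketch assuming the same graph as above; the learning rate of 0.01 is just an illustrative starting value, not a tuned one), swapping in Adam is a one-line change in TensorFlow 1.x:

# Adam adapts the effective step size per parameter during training,
# so it is usually less sensitive to the initial learning rate than
# plain gradient descent with a fixed rate.
optimizer = tf.train.AdamOptimizer(learning_rate=0.01)
train = optimizer.minimize(loss)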
