I am trying to learn TensorFlow by working through their tutorials and making small modifications. I ran into a problem where a tiny change to the code causes the output to become nan.
Their original code is:
import numpy as np
import tensorflow as tf
# Model parameters
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)
# Model input and output
x = tf.placeholder(tf.float32)
linear_model = W * x + b
y = tf.placeholder(tf.float32)
# loss
loss = tf.reduce_sum(tf.square(linear_model - y)) # sum of the squares
# optimizer
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
# training data
x_train = [1,2,3,4]
y_train = [0,-1,-2,-3]
# training loop
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init) # reset values to wrong
for i in range(1000):
    sess.run(train, {x:x_train, y:y_train})
# evaluate training accuracy
curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x:x_train, y:y_train})
print("W: %s b: %s loss: %s"%(curr_W, curr_b, curr_loss))
Apart from the CPU feature warnings (the same ones shown for the second run below), its output converges to the correct values: W = -1 and b = 1. Note that I get all those warning messages on every run because I installed with pip instead of compiling TensorFlow myself.
I modified the code to this, only making the x_train and y_train variables longer:
import numpy as np
import tensorflow as tf
# Model parameters
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)
# Model input and output
x = tf.placeholder(tf.float32)
linear_model = W * x + b
y = tf.placeholder(tf.float32)
# loss
loss = tf.reduce_sum(tf.square(linear_model - y)) # sum of the squares
# optimizer
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
# training data
x_train = [1,2,3,4,5,6,7]
y_train = [0,-1,-2,-3,-4,-5,-6]
# training loop
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init) # reset values to wrong
for i in range(1000):
    sess.run(train, {x:x_train, y:y_train})
# evaluate training accuracy
curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x:x_train, y:y_train})
print("W: %s b: %s loss: %s"%(curr_W, curr_b, curr_loss))
This is the output of the new code:
2017-07-22 22:23:13.129983: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.130125: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.130853: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.130986: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.131126: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.131234: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.132178: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.132874: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
W: [ nan] b: [ nan] loss: nan
I really do not understand why extending the training data causes this to happen. Am I missing something?
Also, I am not at all sure how to debug things in TF, for example printing values incrementally while a variable is being updated repeatedly inside a loop; simply printing the variable does not seem to work. I would like to know, so that I can debug issues like this myself in the future!
Welcome to the wonderful world of hyperparameter tuning. There are a few things you can try. First, instead of producing output only at the end, you can print some output inside the for loop, which could then look something like this:
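A minimal sketch of that modified loop, reusing the session, tensors, and feed values already defined above (the exact print format here is only illustrative):

for i in range(1000):
    sess.run(train, {x: x_train, y: y_train})
    # Fetch and print the current parameters and loss at every step,
    # so divergence becomes visible as it happens rather than only
    # in the final result.
    curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x: x_train, y: y_train})
    print("step %d W: %s b: %s loss: %s" % (i, curr_W, curr_b, curr_loss))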
If you run this, you should see that the values of W and b are being updated so aggressively that the loss, instead of decreasing, actually increases and approaches infinity very quickly, ending in nan. That in turn means your learning rate is much too large. If you divide the learning rate by 10 and set it to 0.001, the divergence goes away, but the final result reveals a different problem.
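That experiment is a one-line change to the optimizer; 0.001 is the value from the discussion above, and the rest of the script stays the same:

optimizer = tf.train.GradientDescentOptimizer(0.001)
train = optimizer.minimize(loss)

The same single line is all that changes for the 0.005 experiment described next.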
With 0.001 the model has not yet converged after the 1000 training steps (look again at the earlier outputs; ideally you would plot a graph of the loss over time). The next experiment, setting the learning rate to 0.005, does converge, reaching W ≈ -1 and b ≈ 1 with a loss near zero; the extended training data still satisfies y = 1 - x exactly, so these are the correct values. So the conclusion is: the tutorial's learning rate of 0.01 is too large for the extended training data, and a smaller rate such as 0.005 lets gradient descent converge again.
Note: you are still using "plain" gradient descent with a fixed learning rate here, but there are also optimizers that adjust the learning rate automatically. The choice of optimizer (and of its parameters) is yet another hyperparameter.
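For example, here is a sketch that swaps in Adam, one of the adaptive optimizers available in the same tf.train API; the learning rate of 0.1 is only an illustrative starting value, not something taken from the experiments above:

# Adam adapts the effective step size per parameter, so it is far less
# sensitive to the exact learning rate than plain gradient descent.
optimizer = tf.train.AdamOptimizer(learning_rate=0.1)
train = optimizer.minimize(loss)

Adam keeps per-parameter running estimates of the gradient's first and second moments and scales each update accordingly, which is why the exact learning rate matters much less here.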