TensorFlow returning NaN for what should be a simple calculation


I'm trying to learn TensorFlow by working through their tutorial and making small modifications. I ran into an issue where a tiny change to the code causes the output to become NaN.

Their original code is:

import numpy as np
import tensorflow as tf

# Model parameters
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)
# Model input and output
x = tf.placeholder(tf.float32)
linear_model = W * x + b
y = tf.placeholder(tf.float32)
# loss
loss = tf.reduce_sum(tf.square(linear_model - y)) # sum of the squares
# optimizer
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
# training data
x_train = [1,2,3,4]
y_train = [0,-1,-2,-3]
# training loop
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init) # reset values to wrong
for i in range(1000):
  sess.run(train, {x:x_train, y:y_train})

# evaluate training accuracy
curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x:x_train, y:y_train})
print("W: %s b: %s loss: %s"%(curr_W, curr_b, curr_loss))

Its output is:

[the output of this first run is not preserved in the original post; as described below, it shows the same CPU-feature warnings as the second run, followed by the correct values W ≈ -1 and b ≈ 1]

Note all the warning messages I receive on every run, which appear because I installed via pip instead of compiling it myself. Still, it does produce the correct output, W = -1 and b = 1.

I modified the code to this, the only change being additional values in the x_train and y_train lists:

import numpy as np
import tensorflow as tf

# Model parameters
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)
# Model input and output
x = tf.placeholder(tf.float32)
linear_model = W * x + b
y = tf.placeholder(tf.float32)
# loss
loss = tf.reduce_sum(tf.square(linear_model - y)) # sum of the squares
# optimizer
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
# training data
x_train = [1,2,3,4,5,6,7]
y_train = [0,-1,-2,-3,-4,-5,-6]
# training loop
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init) # reset values to wrong
for i in range(1000):
  sess.run(train, {x:x_train, y:y_train})

# evaluate training accuracy
curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x:x_train, y:y_train})
print("W: %s b: %s loss: %s"%(curr_W, curr_b, curr_loss))

Here is the output of this new code:

2017-07-22 22:23:13.129983: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.130125: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.130853: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.130986: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.131126: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.131234: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.132178: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-22 22:23:13.132874: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
W: [ nan] b: [ nan] loss: nan

I really don't understand why extending the training data causes this to happen. Am I missing something?

Also, I'm not at all sure how to debug things in TensorFlow, for example getting values printed incrementally while a variable is being updated inside a loop. Simply printing the variable doesn't seem to work. I'd like to know, so that I can debug this kind of thing myself in the future!


1 Answer

Welcome to the wonderful world of hyperparameter tuning. Here is something you can try: first, print some output inside the for loop instead of only at the end, so the loop becomes:

for i in range(1000):
    curr_W, curr_b, curr_loss,_ = sess.run([W, b, loss, train], {x:x_train, y:y_train})
    print("Iteration %d W: %s b: %s loss: %s"%(i, curr_W, curr_b, curr_loss))

If you run this, the output looks as follows:

[the per-iteration output is not preserved in the original post; as described below, it shows W, b and the loss growing rapidly towards infinity]

At this point you should see that the values of W and b are being updated very aggressively, and that instead of decreasing, your loss is actually increasing and heading towards infinity very quickly. That in turn means your learning rate is far too large. If you divide the learning rate by 10 and set it to 0.001, the final result is:

W: [-0.97952145] b: [ 0.8985914] loss: 0.0144026

This means your model has not converged yet (look at the earlier outputs again; ideally you would plot the loss over time). The next experiment, with the learning rate set to 0.005, gives:

W: [-0.99999958] b: [ 0.99999791] loss: 6.48015e-12
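
For reference, the only line that changes between these experiments is the constant passed to the optimizer. A minimal sketch (0.01, 0.001 and 0.005 are the values discussed above; everything else in the script stays the same):

# Diverges once the training set is extended to 7 points:
# optimizer = tf.train.GradientDescentOptimizer(0.01)

# Converges, but too slowly to finish within 1000 iterations:
# optimizer = tf.train.GradientDescentOptimizer(0.001)

# Converges within 1000 iterations in the experiment above:
optimizer = tf.train.GradientDescentOptimizer(0.005)
train = optimizer.minimize(loss)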

So, to conclude:

  • Try printing intermediate results from sess.run() (or eval() on some tensors) to see how the model is learning; see the sketch below this list.
  • Hyperparameter tuning, for fun and profit.
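
As a small extension of the diagnostic loop above (my own sketch, not part of the original answer), you can also abort training as soon as the loss diverges, which keeps the log short while you experiment with learning rates:

import numpy as np

# Assumes the graph, session and feed values from the script above already exist.
for i in range(1000):
    curr_W, curr_b, curr_loss, _ = sess.run([W, b, loss, train], {x: x_train, y: y_train})
    if np.isnan(curr_loss):
        # The loss has blown up; there is no point in continuing.
        print("Loss became NaN at iteration %d, stopping early" % i)
        break
    print("Iteration %d W: %s b: %s loss: %s" % (i, curr_W, curr_b, curr_loss))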

Note: at the moment you are still using "plain" gradient descent with a fixed learning rate, but there are also optimizers that adapt the learning rate automatically. The choice of optimizer (and its parameters) is yet another set of hyperparameters.
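
For example (a sketch assuming the same graph as above; the learning rate of 0.01 is just an illustrative starting value, not a tuned one), swapping in Adam is a one-line change in TensorFlow 1.x:

# Adam adapts the effective step size per parameter during training,
# so it is usually less sensitive to the initial learning rate than
# plain gradient descent with a fixed rate.
optimizer = tf.train.AdamOptimizer(learning_rate=0.01)
train = optimizer.minimize(loss)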
