Problems implementing an XOR gate with a neural network in TensorFlow
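I want to build a simple neural network whose goal is to act as an XOR gate. I'm using the TensorFlow library for Python. For an XOR gate, the only data I train with is the complete truth table, which should be enough, right? I expect overfitting to set in very quickly. The problem with the code is that the weights and biases do not update. Somehow it still gives me 100% accuracy even though the weights and biases are all zero.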
x = tf.placeholder("float", [None, 2])
W = tf.Variable(tf.zeros([2,2]))
b = tf.Variable(tf.zeros([2]))
y = tf.nn.softmax(tf.matmul(x,W) + b)
y_ = tf.placeholder("float", [None,1])
print "Done init"
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
train_step = tf.train.GradientDescentOptimizer(0.75).minimize(cross_entropy)
print "Done loading vars"
init = tf.initialize_all_variables()
print "Done: Initializing variables"
sess = tf.Session()
sess.run(init)
print "Done: Session started"
xTrain = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
yTrain = np.array([[1], [0], [0], [0]])
acc=0.0
while acc<0.85:
    for i in range(500):
        sess.run(train_step, feed_dict={x: xTrain, y_: yTrain})
    print b.eval(sess)
    print W.eval(sess)
    print "Done training"
    correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print "Result:"
    acc= sess.run(accuracy, feed_dict={x: xTrain, y_: yTrain})
    print acc
B0 = b.eval(sess)[0]
B1 = b.eval(sess)[1]
W00 = W.eval(sess)[0][0]
W01 = W.eval(sess)[0][1]
W10 = W.eval(sess)[1][0]
W11 = W.eval(sess)[1][1]
for A,B in product([0,1],[0,1]):
    top = W00*A + W01*A + B0
    bottom = W10*B + W11*B + B1
    print "A:",A," B:",B
    # print "Top",top," Bottom: ", bottom
    print "Sum:",top+bottom
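I am following the tutorial at http://tensorflow.org/tutorials/mnist/beginners/index.md#softmax_regressions and, in the final for loop, I print the results from the matrix (as described in the link).
Can anyone point out my mistake and how I can fix it?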
1 Answer
There are a few problems with your program.
The first problem is that the function you are learning is not XOR but NOR. These lines:
xTrain = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
yTrain = np.array([[1], [0], [0], [0]])
...should be changed to:
xTrain = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
yTrain = np.array([[0], [1], [1], [0]])
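As a quick sanity check on the labels, the two truth tables can be generated and compared with both versions of yTrain. This is only a small sketch in plain Python; the names nor_table, xor_table, original_labels and corrected_labels are made up for the illustration and do not appear in the program above.

```python
from itertools import product

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Truth tables as {input pair: output}.
nor_table = {(a, b): int(not (a or b)) for a, b in product([0, 1], repeat=2)}
xor_table = {(a, b): a ^ b for a, b in product([0, 1], repeat=2)}

original_labels = [1, 0, 0, 0]   # the question's yTrain, flattened
corrected_labels = [0, 1, 1, 0]  # the corrected yTrain, flattened

# The original labels follow the NOR table, the corrected ones the XOR table.
print([nor_table[p] for p in inputs] == original_labels)
print([xor_table[p] for p in inputs] == corrected_labels)
```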
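The next big problem is that the network you have designed is not capable of learning XOR. A single softmax layer can only draw a linear decision boundary, and XOR is not linearly separable. You need a non-linear function (such as tf.nn.relu()) and at least one more layer to learn the XOR function. For example: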
x = tf.placeholder("float", [None, 2])
W_hidden = tf.Variable(...)
b_hidden = tf.Variable(...)
hidden = tf.nn.relu(tf.matmul(x, W_hidden) + b_hidden)
W_logits = tf.Variable(...)
b_logits = tf.Variable(...)
logits = tf.matmul(hidden, W_logits) + b_logits
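To make the need for a hidden layer concrete, here is a rough sketch in plain Python rather than TensorFlow. It searches a coarse grid of weights for a single thresholded linear unit (which has the same kind of decision boundary as a two-class softmax layer) and finds none that reproduces XOR, then shows a hand-built network with two ReLU hidden units that does. The grid range, the helper names and the hand-picked weights are assumptions chosen for the illustration; a trained network will generally end up with different weights.

```python
from itertools import product

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_labels = [0, 1, 1, 0]


def relu(v):
    return max(0.0, v)


def linear_predict(w1, w2, bias, a, b):
    # A single linear unit followed by a threshold at zero.
    return int(w1 * a + w2 * b + bias > 0)


# Coarse grid of candidate weights and biases in [-2, 2] with step 0.25.
grid = [i / 4.0 for i in range(-8, 9)]
linear_solutions = [
    (w1, w2, bias)
    for w1, w2, bias in product(grid, repeat=3)
    if [linear_predict(w1, w2, bias, a, b) for a, b in inputs] == xor_labels
]


def hidden_layer_xor(a, b):
    # Two ReLU hidden units with hand-picked weights, then a linear output:
    # relu(a + b) - 2 * relu(a + b - 1)
    h1 = relu(a + b)
    h2 = relu(a + b - 1)
    return int(h1 - 2 * h2)


print(len(linear_solutions))                        # 0
print([hidden_layer_xor(a, b) for a, b in inputs])  # [0, 1, 1, 0]
```

The grid search is only a demonstration, not a proof, but the result matches the known fact that no single linear boundary separates the XOR classes.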
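Another problem is that initializing the weights to zero prevents your network from training. Typically, you should initialize the weights randomly and the biases to zero. Here is one common way to do it: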
HIDDEN_NODES = 2
W_hidden = tf.Variable(tf.truncated_normal([2, HIDDEN_NODES], stddev=1./math.sqrt(2)))
b_hidden = tf.Variable(tf.zeros([HIDDEN_NODES]))
W_logits = tf.Variable(tf.truncated_normal([HIDDEN_NODES, 2], stddev=1./math.sqrt(HIDDEN_NODES)))
b_logits = tf.Variable(tf.zeros([2]))
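The effect of zero initialization can be checked by hand. The sketch below is plain Python, not TensorFlow: it writes out backpropagation for a 2-input, ReLU-hidden, 2-logit network with mean softmax cross-entropy and one-hot XOR labels, and evaluates the gradients once with everything at zero and once with small non-zero weights. The helper names gradients and largest and the non-zero sample weights are assumptions made up for this illustration.

```python
import math

HIDDEN_NODES = 2
x_train = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_train = [[1, 0], [0, 1], [0, 1], [1, 0]]  # one-hot XOR labels


def gradients(w_hidden, b_hidden, w_logits, b_logits):
    """Mean cross-entropy gradients for a 2 -> HIDDEN_NODES -> 2 ReLU network."""
    n = len(x_train)
    g_wh = [[0.0] * HIDDEN_NODES for _ in range(2)]
    g_bh = [0.0] * HIDDEN_NODES
    g_wl = [[0.0] * 2 for _ in range(HIDDEN_NODES)]
    g_bl = [0.0] * 2
    for x, y in zip(x_train, y_train):
        # Forward pass: hidden ReLU layer, then softmax over two logits.
        pre = [x[0] * w_hidden[0][j] + x[1] * w_hidden[1][j] + b_hidden[j]
               for j in range(HIDDEN_NODES)]
        hidden = [max(0.0, p) for p in pre]
        logits = [sum(hidden[j] * w_logits[j][k] for j in range(HIDDEN_NODES))
                  + b_logits[k] for k in range(2)]
        peak = max(logits)
        exps = [math.exp(v - peak) for v in logits]
        probs = [e / sum(exps) for e in exps]
        # Backward pass: d(loss)/d(logits) = (softmax - labels) / batch size.
        d_logits = [(probs[k] - y[k]) / n for k in range(2)]
        for k in range(2):
            g_bl[k] += d_logits[k]
            for j in range(HIDDEN_NODES):
                g_wl[j][k] += hidden[j] * d_logits[k]
        for j in range(HIDDEN_NODES):
            # ReLU passes gradient back only where its input was positive.
            d_pre = 0.0
            if pre[j] > 0:
                d_pre = sum(d_logits[k] * w_logits[j][k] for k in range(2))
            g_bh[j] += d_pre
            for i in range(2):
                g_wh[i][j] += x[i] * d_pre
    return g_wh, g_bh, g_wl, g_bl


def largest(grads):
    """Largest absolute entry across all gradient arrays."""
    g_wh, g_bh, g_wl, g_bl = grads
    flat = [v for row in g_wh for v in row] + g_bh
    flat += [v for row in g_wl for v in row] + g_bl
    return max(abs(v) for v in flat)


zero_grads = gradients([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0],
                       [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
# Arbitrary small non-zero weights, chosen only for illustration.
nonzero_grads = gradients([[0.5, -0.3], [0.2, 0.4]], [0.0, 0.0],
                          [[0.3, -0.2], [-0.1, 0.6]], [0.0, 0.0])

print(largest(zero_grads))         # 0.0: gradient descent cannot move
print(largest(nonzero_grads) > 0)  # True: there is a signal to follow
```

With all-zero weights the hidden activations are zero, so nothing flows into either weight matrix, and because the XOR labels are balanced even the output bias gradient cancels out.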
Putting it all together, and using TensorFlow's cross-entropy routine (with a one-hot encoding of yTrain for convenience), here is a program that learns XOR:
import math
import tensorflow as tf
import numpy as np
HIDDEN_NODES = 10
x = tf.placeholder(tf.float32, [None, 2])
W_hidden = tf.Variable(tf.truncated_normal([2, HIDDEN_NODES], stddev=1./math.sqrt(2)))
b_hidden = tf.Variable(tf.zeros([HIDDEN_NODES]))
hidden = tf.nn.relu(tf.matmul(x, W_hidden) + b_hidden)
W_logits = tf.Variable(tf.truncated_normal([HIDDEN_NODES, 2], stddev=1./math.sqrt(HIDDEN_NODES)))
b_logits = tf.Variable(tf.zeros([2]))
logits = tf.matmul(hidden, W_logits) + b_logits
y = tf.nn.softmax(logits)
y_input = tf.placeholder(tf.float32, [None, 2])
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y_input)
loss = tf.reduce_mean(cross_entropy)
train_op = tf.train.GradientDescentOptimizer(0.2).minimize(loss)
init_op = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init_op)
xTrain = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
yTrain = np.array([[1, 0], [0, 1], [0, 1], [1, 0]])
for i in xrange(500):
    _, loss_val = sess.run([train_op, loss], feed_dict={x: xTrain, y_input: yTrain})
    if i % 10 == 0:
        print "Step:", i, "Current loss:", loss_val
        for x_input in [[0, 0], [0, 1], [1, 0], [1, 1]]:
            print x_input, sess.run(y, feed_dict={x: [x_input]})
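The one-hot yTrain in the program above can be derived from the plain XOR outputs. A minimal sketch in plain Python follows; the helper name one_hot is an assumption for illustration and is not a TensorFlow function used in the program.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows."""
    return [[1 if k == label else 0 for k in range(num_classes)]
            for label in labels]


xor_labels = [0, 1, 1, 0]  # XOR outputs for inputs 00, 01, 10, 11
print(one_hot(xor_labels, 2))  # [[1, 0], [0, 1], [0, 1], [1, 0]]
```

Column 0 then means "output is 0" and column 1 means "output is 1", which is why the network has two logits.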
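Note that this is probably not the most efficient neural network for computing XOR, so suggestions for tweaking the parameters are welcome!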
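As for why the original program reported 100% accuracy with all-zero weights, a likely explanation is the shape of y_. With a [None, 1] label placeholder, tf.argmax(y_, 1) is always 0, and a softmax over two equal logits gives [0.5, 0.5], whose argmax is also 0 if ties resolve to the first index (NumPy does this; TensorFlow's documentation does not guarantee it, so treat that part as an assumption). The sketch below reproduces the arithmetic in plain Python with made-up helper names. It also shows that the broadcast loss -sum(y_ * log(y)) has a zero gradient at that point, which is consistent with the weights never updating.

```python
import math


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(row):
    # First index of the maximum, the tie-breaking rule NumPy uses.
    return row.index(max(row))


x_train = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_train = [[1], [0], [0], [0]]  # shape [4, 1], as in the question
W = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]

matches = []
grad_b = [0.0, 0.0]
for x, y in zip(x_train, y_train):
    logits = [x[0] * W[0][k] + x[1] * W[1][k] + b[k] for k in range(2)]
    probs = softmax(logits)
    # argmax over a one-element row is always 0.
    matches.append(argmax(probs) == argmax(y))
    # The single label is broadcast over both outputs, so the loss gradient
    # with respect to logit k is y[0] * (2 * probs[k] - 1).
    for k in range(2):
        grad_b[k] += y[0] * (2 * probs[k] - 1)

accuracy = sum(matches) / float(len(matches))
print(accuracy)  # 1.0
print(grad_b)    # [0.0, 0.0]
```

The gradient with respect to W is this same per-example quantity multiplied by the inputs, so it is zero as well.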