Can't get TensorFlow to do something trivial

Posted 2024-04-20 08:57:26


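I have a vector x and want to compute a vector y such that y[j] = x[j]**2, using a neural network specified in TensorFlow as shown below. It doesn't work very well: the error is large.
Am I doing something wrong?
Any help would be greatly appreciated.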

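It works by first generating data in Xtrain, Ytrain, Xtest and Ytest, then creating placeholder variables so that TensorFlow can run.
It then specifies three hidden layers and one output layer. After that it trains, and uses a feed dictionary to produce Ypred, the prediction for Ytest.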

import numpy as np
import tensorflow as tf

n = 10
k = 1000
n_hidden = 10
learning_rate = .01
training_epochs = 100000

Xtrain = []
Ytrain = []
Xtest = []
Ytest = []

for i in range(0,k,1):
    X = np.random.randn(1,n)[0]
    Xtrain += [X]
    Ytrain += [Xtrain[-1]**2]
    X = np.random.randn(1,n)[0]
    Xtest += [X]
    Ytest += [Xtest[-1]**2]

x = tf.placeholder(tf.float64,shape = (k,n))
y = tf.placeholder(tf.float64,shape = (k,n))

W1 = tf.Variable(tf.random_normal((n,n_hidden),dtype = tf.float64))
b1 = tf.Variable(tf.random_normal((n_hidden,),dtype = tf.float64))
x_hidden1 = tf.nn.sigmoid(tf.matmul(x,W1) + b1)

W2 = tf.Variable(tf.random_normal((n,n_hidden),dtype = tf.float64))
b2 = tf.Variable(tf.random_normal((n_hidden,),dtype = tf.float64))
x_hidden2 = tf.nn.sigmoid(tf.matmul(x_hidden1,W2) + b2)

W3 = tf.Variable(tf.random_normal((n,n_hidden),dtype = tf.float64))
b3 = tf.Variable(tf.random_normal((n_hidden,),dtype = tf.float64))
x_hidden3 = tf.nn.sigmoid(tf.matmul(x_hidden1,W3) + b3)

W4 = tf.Variable(tf.random_normal((n,n_hidden),dtype = tf.float64))
b4 = tf.Variable(tf.random_normal((n_hidden,),dtype = tf.float64))
y_pred = tf.matmul(x_hidden3,W4) + b4

penalty = tf.reduce_sum(tf.abs((y - y_pred)))
train_op = tf.train.AdamOptimizer(learning_rate).minimize(penalty)

model = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(model)
    for i in range(0,training_epochs):
        sess.run(train_op,{x: Xtrain,y: Ytrain})

    Ypred = y_pred.eval(feed_dict = {x: Xtest})
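
The data-generation step described above can also be written without a Python loop. This is a minimal NumPy sketch, assuming the same n and k as in the question; the fixed seed and the use of `np.random.default_rng` are my own choices for reproducibility, not part of the original code.

```python
import numpy as np

n = 10    # length of each vector, as in the question
k = 1000  # number of samples, as in the question

# Seed chosen only so the sketch is reproducible (an assumption).
rng = np.random.default_rng(0)

# Each row is one sample; the targets are the element-wise squares.
Xtrain = rng.standard_normal((k, n))
Ytrain = Xtrain ** 2
Xtest = rng.standard_normal((k, n))
Ytest = Xtest ** 2

print(Xtrain.shape, Ytrain.shape)
```

The resulting (k, n) arrays can be passed through a feed dictionary in the same way as the lists of rows built in the loop.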

2 Answers

This code does better. Does anyone want to make further improvements?

import numpy as np
import tensorflow as tf

n = 10
k = 1000
n_hidden = 50
learning_rate = .001
training_epochs = 100000

Xtrain = []
Ytrain = []
Xtest = []
Ytest = []

for i in range(0,k,1):
    X = np.random.randn(1,n)[0]
    Xtrain += [X]
    Ytrain += [Xtrain[-1]**2]
    X = np.random.randn(1,n)[0]
    Xtest += [X]
    Ytest += [Xtest[-1]**2]

x = tf.placeholder(tf.float64,shape = (k,n))
y = tf.placeholder(tf.float64,shape = (k,n))

W1 = tf.Variable(tf.random_normal((n,n_hidden),dtype = tf.float64))
b1 = tf.Variable(tf.random_normal((n_hidden,),dtype = tf.float64))
x_hidden1 = tf.nn.sigmoid(tf.matmul(x,W1) + b1)

W2 = tf.Variable(tf.random_normal((n_hidden,n_hidden),dtype = tf.float64))
b2 = tf.Variable(tf.random_normal((n_hidden,),dtype = tf.float64))
x_hidden2 = tf.nn.sigmoid(tf.matmul(x_hidden1,W2) + b2)

W3 = tf.Variable(tf.random_normal((n_hidden,n),dtype = tf.float64))
b3 = tf.Variable(tf.random_normal((n,),dtype = tf.float64))
y_pred = tf.matmul(x_hidden2,W3) + b3

penalty = tf.reduce_sum((y - y_pred)**2)
train_op = tf.train.AdamOptimizer(learning_rate).minimize(penalty)

model = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(model)
    for i in range(0,training_epochs):
        sess.run(train_op,{x: Xtrain,y: Ytrain})

    Ypred = y_pred.eval(feed_dict = {x: Xtest})
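
As a shape check on the network above, here is a NumPy sketch of the same forward pass. It is a stand-in for the TensorFlow graph, not a replacement: the weights are random and nothing is trained. It only traces how a (k, n) input flows through weight matrices of shape (n, n_hidden), (n_hidden, n_hidden) and (n_hidden, n). The `sigmoid` helper and the seed are assumptions of the sketch.

```python
import numpy as np

n, k, n_hidden = 10, 1000, 50  # same sizes as in the code above


def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


rng = np.random.default_rng(0)  # seed is arbitrary
x = rng.standard_normal((k, n))
y = x ** 2

# Layer 1: (k, n) @ (n, n_hidden) -> (k, n_hidden)
W1 = rng.standard_normal((n, n_hidden))
b1 = rng.standard_normal(n_hidden)
h1 = sigmoid(x @ W1 + b1)

# Layer 2: (k, n_hidden) @ (n_hidden, n_hidden) -> (k, n_hidden)
W2 = rng.standard_normal((n_hidden, n_hidden))
b2 = rng.standard_normal(n_hidden)
h2 = sigmoid(h1 @ W2 + b2)

# Output layer: (k, n_hidden) @ (n_hidden, n) -> (k, n)
W3 = rng.standard_normal((n_hidden, n))
b3 = rng.standard_normal(n)
y_pred = h2 @ W3 + b3

# Summed squared error, the same form as `penalty` above.
penalty = np.sum((y - y_pred) ** 2)
print(h1.shape, h2.shape, y_pred.shape)
```

With these shapes the hidden width can differ from n, and each layer consumes the output of the layer before it.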

Here are some simple modifications to the code:

import numpy as np
import tensorflow as tf

n = 10
k = 1000
learning_rate = 1e-3
training_epochs = 100000

# It will be better for you to use PEP8 style

# None here will allow you to feed data with ANY k size
x = tf.placeholder(tf.float64, shape=(None, n))
y = tf.placeholder(tf.float64, shape=(None, n))

# Use the default layer constructors;
# unlike your implementation, they use a different random initializer
out = tf.layers.dense(x, 100)
out = tf.layers.batch_normalization(out)
# ReLU generally trains better than sigmoid; there are a lot of articles about it
out = tf.nn.relu(out)

out = tf.layers.dense(out, 200)
out = tf.layers.batch_normalization(out)
out = tf.nn.relu(out)

out = tf.layers.dense(out, n)

# total loss = mean L1 over samples
# each sample is a vector of 10 values, so sum the absolute errors
# along the feature axis (axis=1), then take the mean of those sums
l1 = tf.reduce_mean(tf.reduce_sum(tf.abs(y - out), axis=1))
train_op = tf.train.AdamOptimizer(learning_rate).minimize(l1)

model = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(model)
    for i in range(training_epochs):
        xs = np.random.randn(k, n)
        ys = xs ** 2
        _, l1_value = sess.run(
            [train_op, l1],
            feed_dict={x: xs, y: ys})
        if (i + 1) % 10000 == 0 or i == 0:
            print('Current l1({}/{}) = {}'.format(
                i + 1, training_epochs, l1_value))

    xs = np.random.randn(k, n)
    ys = xs ** 2
    test_l1 = sess.run(l1, feed_dict={x: xs, y: ys})
    print('Total l1 at test = {}'.format(test_l1))

Output:

Current l1(1/100000) = 11.0853215657
Current l1(10000/100000) = 0.126037403282
Current l1(20000/100000) = 0.096445475666
Current l1(30000/100000) = 0.0719392853473
Current l1(40000/100000) = 0.0690671103719
Current l1(50000/100000) = 0.07661241544
Current l1(60000/100000) = 0.0743827124406
Current l1(70000/100000) = 0.0656016587469
Current l1(80000/100000) = 0.0675546809828
Current l1(90000/100000) = 0.0649035400487
Current l1(100000/100000) = 0.0583308788607
Total l1 at test = 0.0613149096968

The total loss can be improved by trying other architectures, learning rates, batch sizes, epoch counts, loss functions, etc.
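
To make the choice of loss function concrete, here is a small NumPy sketch of the per-sample L1 loss used in the code above, next to a per-sample squared-error variant. The function names `mean_l1` and `mean_l2` and the toy arrays are illustrative only.

```python
import numpy as np


def mean_l1(y_true, y_pred):
    """Sum absolute errors within each sample, then average over samples."""
    return np.mean(np.sum(np.abs(y_true - y_pred), axis=1))


def mean_l2(y_true, y_pred):
    """Sum squared errors within each sample, then average over samples."""
    return np.mean(np.sum((y_true - y_pred) ** 2, axis=1))


# Two toy samples with two values each.
y_true = np.array([[1.0, 4.0], [9.0, 16.0]])
y_pred = np.array([[1.5, 4.0], [8.0, 18.0]])

print(mean_l1(y_true, y_pred))  # (0.5 + 3.0) / 2 = 1.75
print(mean_l2(y_true, y_pred))  # (0.25 + 5.0) / 2 = 2.625
```

Because both losses average over samples, their values stay comparable when the batch size k changes, which a plain sum over the whole batch does not.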

It looks like the architecture could be made larger; you could then run training for a long time and reach about 1e-3.

For more information on how this works and how to do it, see the CS231 course.

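Also, a note on an assumption about the input data: some of the data I tested on may also have been drawn during training. Because the task is simple, that is acceptable here, but it is better to guarantee that the test set contains no training samples.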
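
One way to guarantee a disjoint test set is to draw a single pool of samples once and split it by index. This is a minimal NumPy sketch under my own assumptions (a pool of 2 * k samples, an arbitrary seed, an even split); it is not the procedure used in the answer above, which draws fresh random data at every step.

```python
import numpy as np

n = 10
k = 1000

rng = np.random.default_rng(0)  # seed is arbitrary

# Draw one pool of 2 * k samples, then split it by shuffled index.
X = rng.standard_normal((2 * k, n))
perm = rng.permutation(2 * k)
train_idx, test_idx = perm[:k], perm[k:]

Xtrain, Xtest = X[train_idx], X[test_idx]
Ytrain, Ytest = Xtrain ** 2, Xtest ** 2

# No index appears in both halves, so no row is shared.
print(len(np.intersect1d(train_idx, test_idx)))
```

Training batches would then be drawn only from Xtrain and Ytrain, with Xtest and Ytest kept aside for the final evaluation.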
