How to feed RNN output back into the input in TensorFlow

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

tf.reset_default_graph()
n_samples=100
state_size=5
lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(state_size, forget_bias=1.)
def_x = np.sin(np.linspace(0, 10, n_samples))[None, :, None]
zero_x = np.zeros(n_samples)[None, :, None]
X = tf.placeholder_with_default(zero_x, [None, n_samples, 1])
output, last_states = tf.nn.dynamic_rnn(inputs=X, cell=lstm_cell, dtype=tf.float64)
pred = tf.contrib.layers.fully_connected(output, 1, activation_fn=tf.tanh)
Y = np.roll(def_x, 1)
loss = tf.reduce_sum(tf.pow(pred-Y, 2))/(2*n_samples)
opt = tf.train.AdamOptimizer().minimize(loss)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
# Initial state run
plt.show(plt.plot(output.eval()[0]))
plt.plot(def_x.squeeze())
plt.show(plt.plot(pred.eval().squeeze()))
steps = 1001
for i in range(steps):
    p, l, _= sess.run([pred, loss, opt])

with tf.variable_scope('sine', reuse=True):
    X_test = tf.placeholder(tf.float64)
    X_reshaped = tf.reshape(X_test, [1, -1, 1])
    output, last_states = tf.nn.dynamic_rnn(lstm_cell, X_reshaped, dtype=tf.float64)
    pred = tf.contrib.layers.fully_connected(output, 1, activation_fn=tf.tanh)
    test_vals = [0.]
    for i in range(1000):
        val = pred.eval({X_test:np.array(test_vals)[None, :, None]})
        test_vals.append(val)

3 Answers

User

Answer 1 · edited 2024-05-23 20:15:38

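If I understand correctly, you want a way to feed the output of time step t as the input to time step t+1, right? For that, there is a relatively simple workaround you can use at test time:

Make sure your input placeholder accepts a dynamic sequence length, i.e. the size of the time dimension is None.
Make sure you are using tf.nn.dynamic_rnn (which you do in the posted example).
Pass the initial state into dynamic_rnn.
Then, at test time, you can loop over your sequence and feed each time step individually (i.e. the maximum sequence length is 1). In addition, you only have to carry over the internal state of the RNN. See the pseudocode below (the variable names refer to your code snippet).

That is, change the definition of the model to something like this: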

# Define the data first so zero_x exists before the placeholder uses it
def_x = np.sin(np.linspace(0, 10, n_samples))[None, :, None]
zero_x = np.zeros(n_samples)[None, :, None]
lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(state_size, forget_bias=1.)
X = tf.placeholder_with_default(zero_x, [None, None, 1])  # [batch_size, seq_length, dimension of input]
batch_size = tf.shape(X)[0]
# The state dtype must match the float64 inputs
initial_state = lstm_cell.zero_state(batch_size, dtype=tf.float64)
output, last_states = tf.nn.dynamic_rnn(inputs=X, cell=lstm_cell, dtype=tf.float64,
    initial_state=initial_state)
pred = tf.contrib.layers.fully_connected(output, 1, activation_fn=tf.tanh)

Then you can run inference like this:

# last_states is the final state returned by tf.nn.dynamic_rnn in the model definition
fetches = {'final_state': last_states,
           'prediction': pred}

toy_initial_input = np.array([[[1]]])  # put suitable data here
seq_length = 20  # put whatever is reasonable here for you

# get the output for the first time step
feed_dict = {X: toy_initial_input}
eval_out = sess.run(fetches, feed_dict)
outputs = [eval_out['prediction']]
next_state = eval_out['final_state']

for i in range(1, seq_length):
    feed_dict = {X: outputs[-1],
                 initial_state: next_state}
    eval_out = sess.run(fetches, feed_dict)
    outputs.append(eval_out['prediction'])
    next_state = eval_out['final_state']

# outputs now contains the sequence you want
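
The same idea can be illustrated without TensorFlow. The following is a minimal sketch, assuming a hypothetical scalar tanh cell with fixed, made-up weights (`W_X`, `W_H`, `W_OUT` and the function names are not from the original code). It shows that running one time step per call while carrying the state reproduces a full-sequence run, and then uses the same step function to generate in closed loop.

```python
import math

# Hypothetical fixed weights for a toy scalar RNN (illustration only).
W_X, W_H, W_OUT = 0.5, 0.8, 1.2

def step(x, h):
    """Run one time step: return (prediction, new_state)."""
    h_new = math.tanh(W_X * x + W_H * h)
    return math.tanh(W_OUT * h_new), h_new

def run_sequence(xs, h=0.0):
    """Run the cell over a whole input sequence, like one dynamic_rnn call."""
    preds = []
    for x in xs:
        y, h = step(x, h)
        preds.append(y)
    return preds, h

def generate(seed, seq_length, h=0.0):
    """Closed loop: each prediction becomes the next input; the state is carried."""
    outputs = []
    x = seed
    for _ in range(seq_length):
        x, h = step(x, h)
        outputs.append(x)
    return outputs

# Feeding one step at a time with a carried state matches the full-sequence run.
xs = [0.1, -0.3, 0.7, 0.2]
full_preds, full_state = run_sequence(xs)
h = 0.0
stepwise = []
for x in xs:
    p, h = run_sequence([x], h)
    stepwise.extend(p)

generated = generate(seed=1.0, seq_length=20)
```

Because the state is the only thing the cell remembers between calls, passing it back in explicitly is enough to make single-step calls behave like one long sequence.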

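Note that this also works for batches, but it can get more complicated if sequences of different lengths are used in the same batch.

If you want to do this kind of prediction not only at test time but also at training time, that is possible too, but it is somewhat more complicated to implement.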

User

Answer 2 · edited 2024-05-23 20:15:38

I know I'm a bit late to the party, but I think this gist could be useful:

https://gist.github.com/CharlieCodex/f494b27698157ec9a802bc231d8dcf31

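It lets you automatically pass the output through a filter and feed it back into the network as input. To make the shapes match, processing can be set to a tf.layers.Dense layer.

Feel free to ask any questions!

Edit:

In your particular case, create a lambda that processes the dynamic_rnn outputs into your character vector space. For example: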

# if you have:
W = tf.Variable( ... )
B = tf.Variable( ... )
Yo, Ho = tf.nn.dynamic_rnn( cell , inputs , state )
logits = tf.matmul(W, Yo) + B
 ...
# use self_feeding_rnn as
process_yo = lambda Yo: tf.matmul(W, Yo) + B
Yo, Ho = self_feeding_rnn( cell, seed, initial_state, processing=process_yo)
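
To illustrate why a processing step is needed, here is a minimal sketch in plain Python. It is not the gist's implementation: `self_feeding_loop`, `toy_cell`, and the weights are hypothetical names made up for this example. The cell emits a 2-element output while its input is a single number, so a `processing` function projects each output back into the input space before it is fed in again.

```python
import math

def toy_cell(x, state):
    """Hypothetical cell: scalar input, 2-element state; the output equals the new state."""
    h0 = math.tanh(0.6 * x + 0.3 * state[0] - 0.2 * state[1])
    h1 = math.tanh(-0.4 * x + 0.5 * state[0] + 0.1 * state[1])
    new_state = (h0, h1)
    return new_state, new_state

def self_feeding_loop(cell, seed, initial_state, n_steps, processing):
    """Feed processing(cell output) back in as the next input for n_steps."""
    x, state = seed, initial_state
    fed_back = []
    for _ in range(n_steps):
        output, state = cell(x, state)
        x = processing(output)  # project the output into the input space
        fed_back.append(x)
    return fed_back, state

# Plays the role of W * Yo + B: maps the 2-element output to one number.
W, B = (0.7, -0.5), 0.1
process_yo = lambda yo: W[0] * yo[0] + W[1] * yo[1] + B

values, final_state = self_feeding_loop(toy_cell, 1.0, (0.0, 0.0), 10, process_yo)
```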

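User

Answer 3 · edited 2024-05-23 20:15:38

You can use the network's own output (the last state) as the input (the initial state) of the next step. One way to do this is:

Use zero-initialized variables as the input state at every time step.
Each time you finish a truncated sequence and get some output state, update the state variables with the output state you just got.

The second step can be done in either of two ways:

Fetch the states into Python and feed them back into the graph on the next run, as done in the ptb example in tensorflow/models.
Build an update op in the graph and add a dependency, as done in the ptb example in tensorpack.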
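
A minimal sketch of the fetch-and-feed-back option in plain Python, assuming a hypothetical scalar cell with made-up weights: a long sequence is processed in truncated chunks, and the state returned by each chunk is kept in Python and passed in as the initial state of the next one. With the state carried over, the chunked run gives the same outputs as a single pass; resetting to zero at every chunk does not.

```python
import math

def run_chunk(xs, state):
    """Run a toy scalar tanh cell over one truncated chunk; return (outputs, last_state)."""
    outputs = []
    for x in xs:
        state = math.tanh(0.9 * x + 0.7 * state)  # hypothetical weights
        outputs.append(state)
    return outputs, state

sequence = [math.sin(0.3 * t) for t in range(12)]
chunk_len = 4

# Reference: the whole sequence in a single pass from a zero state.
reference, _ = run_chunk(sequence, 0.0)

# Truncated run: start from zero, then overwrite the stored state after every chunk.
state = 0.0
carried = []
for start in range(0, len(sequence), chunk_len):
    outputs, state = run_chunk(sequence[start:start + chunk_len], state)
    carried.extend(outputs)

# For contrast: resetting the state to zero at every chunk loses the history.
reset = []
for start in range(0, len(sequence), chunk_len):
    outputs, _ = run_chunk(sequence[start:start + chunk_len], 0.0)
    reset.extend(outputs)
```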
