The computation proceeds as follows. An LSTM layer has an
input-to-state component and a recurrent state-to-state component that
together determine the four gates inside the LSTM core. To enhance
parallelization in the Row LSTM the input-to-state component is first
computed for the entire two-dimensional input map; for this a k × 1
convolution is used to follow the row-wise orientation of the LSTM
itself. The convolution is masked to include only the valid context
(see Section 3.4) and produces a tensor of size 4h × n × n,
representing the four gate vectors for each position in the input map,
where h is the number of output feature maps. To compute one step of
the state-to-state component of the LSTM layer, one is given the
previous hidden and cell states hi−1 and ci−1, each of size h × n × 1.
The new hidden and cell states hi, ci are obtained as follows:

[oi, fi, ii, gi] = σ(Kss ⊛ hi−1 + Kis ⊛ xi)
ci = fi ⊙ ci−1 + ii ⊙ gi
hi = oi ⊙ tanh(ci)

where xi of size h × n × 1 is row i of the input map, ⊛ represents the
convolution operation and ⊙ the elementwise
multiplication. The weights Kss and Kis are the kernel weights for the
state-to-state and the input-to-state components, where the latter is
precomputed as described above. In the case of the output, forget and
input gates oi , fi and ii , the activation σ is the logistic sigmoid
function, whereas for the content gate gi , σ is the tanh function.
Each step computes at once the new state for an entire row of the
input map.
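As a concrete illustration, here is a minimal NumPy sketch of one Row LSTM step as described above, assuming the input-to-state term for row i has already been precomputed; the function and variable names are hypothetical, not taken from the paper or the linked repository:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv1d_same(kernel, row):
    """k-tap 1-D convolution along a row: kernel (out_ch, in_ch, k), row (in_ch, n).
    Zero-padded so the output keeps length n."""
    out_ch, in_ch, k = kernel.shape
    _, n = row.shape
    pad = k // 2
    padded = np.pad(row, ((0, 0), (pad, pad)))
    out = np.zeros((out_ch, n))
    for j in range(n):
        # contract the (in_ch, k) window against each output filter
        out[:, j] = np.tensordot(kernel, padded[:, j:j + k], axes=([1, 2], [0, 1]))
    return out

def row_lstm_step(K_ss, is_row, h_prev, c_prev):
    """One Row LSTM step. is_row is the precomputed input-to-state term
    for row i, shape (4h, n); h_prev and c_prev have shape (h, n)."""
    gates = conv1d_same(K_ss, h_prev) + is_row      # (4h, n)
    o, f, i, g = np.split(gates, 4, axis=0)         # paper order [o, f, i, g]
    o, f, i = sigmoid(o), sigmoid(f), sigmoid(i)    # logistic sigmoid gates
    g = np.tanh(g)                                  # tanh content gate
    c = f * c_prev + i * g
    h_new = o * np.tanh(c)
    return h_new, c

# tiny demo: h = 2 feature maps, row length n = 5, kernel size k = 3
rng = np.random.default_rng(0)
K_ss = rng.standard_normal((8, 2, 3)) * 0.1
is_row = rng.standard_normal((8, 5))    # stands in for the precomputed input-to-state row
h_prev = rng.standard_normal((2, 5))
c_prev = rng.standard_normal((2, 5))
h_new, c_new = row_lstm_step(K_ss, is_row, h_prev, c_prev)
```

Note that every position in the row is updated in the same call, which is exactly the parallelism the paper trades context for.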
Summary

Below is a partial implementation in progress:
https://github.com/carpedm20/pixel-rnn-tensorflow

Below is a write-up of Google's paper on Towards Data Science:
https://towardsdatascience.com/summary-of-pixelrnn-by-google-deepmind-7-min-read-938d9871d6d9
Row LSTM

From the linked DeepMind blog post:

A pixel's hidden state (red in the image below) is a "memory" based on the triangle of three pixels that precede it. Because those lie in a row, we can compute them in parallel, which speeds up computation. We trade away some context information (using more history, or memory) for this parallelism and faster training. In practice, the implementation relies on several other optimizations and is quite complex. From the original paper:

The Diagonal BiLSTM was developed to exploit the speedup of parallelization without sacrificing as much context information. A node in the Diagonal BiLSTM looks to its left and above it; because those nodes in turn looked to their left and above, the conditional probability of a given node depends, in a sense, on all of its ancestors. Otherwise, the architecture is very similar. From the DeepMind blog post:
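The claim that a node depends on all of its ancestors can be checked mechanically. The sketch below (a hypothetical helper, not from either linked code base) follows the left and above dependencies transitively and shows that a node's context closes over the full rectangle of pixels above and to its left:

```python
def dblstm_ancestors(r, c):
    """Positions that node (r, c) depends on transitively, given that each
    node looks only at its left neighbour (r, c-1) and the node above (r-1, c)."""
    seen = set()
    stack = [(r, c)]
    while stack:
        y, x = stack.pop()
        for py, px in ((y, x - 1), (y - 1, x)):
            if py >= 0 and px >= 0 and (py, px) not in seen:
                seen.add((py, px))
                stack.append((py, px))
    return seen

# node (2, 2) ends up seeing every pixel in the 3 x 3 block above-left of it
anc = dblstm_ancestors(2, 2)
```

So although each node only looks at two neighbours directly, the recursion gives it the same upper-left context the conditional factorization requires.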