Answers to the LSTM autoencoder question

LSTM autoencoder


I am trying to build an LSTM autoencoder whose goal is to obtain a fixed-size vector from a sequence that represents the sequence as well as possible. The autoencoder consists of two parts:
<ul>
<li><code>LSTM</code> encoder: takes a sequence and returns an output vector (<code>return_sequences = False</code>)</li>
<li><code>LSTM</code> decoder: takes an output vector and returns a sequence (<code>return_sequences = True</code>)</li>
</ul>
So, in the end, the encoder is a many-to-one LSTM and the decoder is a one-to-many LSTM.
<a href="https://i.stack.imgur.com/kwhAP.jpg" rel="noreferrer"><img src="https://i.stack.imgur.com/kwhAP.jpg" alt="enter image description here"/></a>
Image source: <a href="http://karpathy.github.io/2015/05/21/rnn-effectiveness/" rel="noreferrer">Andrej Karpathy</a>
At a high level, the code looks like this (similar to what is described <a href="https://github.com/fchollet/keras/issues/5138" rel="noreferrer">here</a>):
<pre><code>encoder = Model(...)
decoder = Model(...)

autoencoder = Model(encoder.inputs, decoder(encoder(encoder.inputs)))

autoencoder.compile(loss='binary_crossentropy',
                    optimizer='adam',
                    metrics=['accuracy'])

autoencoder.fit(data, data,
                batch_size=100,
                epochs=1500)
</code></pre>
The shape (number of training examples, sequence length, input dimension) of the <code>data</code> array is <code>(1200, 10, 5)</code>, and it looks like this:
<pre><code>array([[[1, 0, 0, 0, 0],
        [0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0],
        ...,
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0]],
        ... ]
</code></pre>
Problem: I am not sure how to proceed, especially how to integrate <code>LSTM</code> into <code>Model</code> and how to get the decoder to generate a sequence from a vector.
I am using <code>keras</code> with the <code>tensorflow</code> backend.
EDIT: In case someone wants to try it out, here is my procedure for generating random sequences with a moving 1 (including padding):
<pre><code>import random
import math

def getNotSoRandomList(x):
    rlen = 8
    rlist = [0 for x in range(rlen)]
    if x <= 7:
        rlist[x] = 1
    return rlist


sequence = [[getNotSoRandomList(x) for x in range(round(random.uniform(0, 10)))] for y in range(5000)]

### Padding afterwards

from keras.preprocessing import sequence as seq

data = seq.pad_sequences(
    sequences = sequence,
    padding='post',
    maxlen=None,
    truncating='post',
    value=0.
)
</code></pre>


1 answer

Anonymous, 1 day ago

Here is an example.
Let's create synthetic data consisting of a few sequences. The idea is to look at these sequences through the lens of an autoencoder; in other words, to reduce their dimensionality or summarize each one as a fixed-length vector.
<pre><code>import numpy as np
import pandas as pd
from sklearn import preprocessing
from keras.preprocessing.sequence import pad_sequences

# define input sequences (a plain list, because the lengths differ)
sequence = [[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
            [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
            [0.2, 0.4, 0.6, 0.8],
            [0.3, 0.6, 0.9, 1.2]]

# prepare to normalize: one column per sequence, NaN where a sequence is shorter
x = pd.DataFrame(sequence).T.values
scaler = preprocessing.StandardScaler()
x_scaled = scaler.fit_transform(x)
# standardized sequences with the NaN filler removed; the padding step below
# uses the raw values, pass sequence_normalized instead to train on scaled data
sequence_normalized = [col[~np.isnan(col)] for col in x_scaled.T]

# make sure to use dtype='float32' when padding; the default integer dtype
# would truncate the floating-point values
sequence = pad_sequences(sequence, padding='post', dtype='float32')

# reshape input into [samples, timesteps, features]
n_obs = len(sequence)
n_in = sequence.shape[1]  # 9, the length of the longest sequence
sequence = sequence.reshape((n_obs, n_in, 1))
</code></pre>
Let's design a simple LSTM
<pre><code>import matplotlib.pyplot as plt
from keras.models import Model
from keras.layers import Input, LSTM, RepeatVector, TimeDistributed, Dense
from keras.utils import plot_model

# define encoder: compresses each sequence into a vector of size 2
visible = Input(shape=(n_in, 1))
encoder = LSTM(2, activation='relu')(visible)

# define reconstruct decoder: repeat the encoded vector once per timestep
decoder1 = RepeatVector(n_in)(encoder)
decoder1 = LSTM(100, activation='relu', return_sequences=True)(decoder1)
decoder1 = TimeDistributed(Dense(1))(decoder1)

# tie it together
myModel = Model(inputs=visible, outputs=decoder1)

# summarize layers
myModel.summary()

myModel.compile(optimizer='adam', loss='mse')
history = myModel.fit(sequence, sequence,
                      epochs=400,
                      verbose=0,
                      validation_split=0.1,
                      shuffle=True)

# requires pydot and graphviz
plot_model(myModel, show_shapes=True, to_file='reconstruct_lstm_autoencoder.png')

# demonstrate recreation
yhat = myModel.predict(sequence, verbose=0)

# plot our loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model train vs validation loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper right')
plt.show()
</code></pre>
<a href="https://i.stack.imgur.com/De3hEm.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/De3hEm.png" alt="enter image description here"/></a>
Now let's extract the encoder part of the trained autoencoder
<pre><code># reuse the trained encoder layer to encode the training input
encoder_layer = myModel.layers[1]
encoder_input = Input(shape=(n_in, 1))
encoder_model = Model(encoder_input, encoder_layer(encoder_input))

# we are interested in what the encoded sequences of length 2
# (the dimension of the encoder) look like
out = encoder_model.predict(sequence)

f = plt.figure()
myx = out[:, 0]
myy = out[:, 1]
s = plt.scatter(myx, myy)

# label each point with the 1-based index of its sequence
for i in range(len(out)):
    plt.annotate(str(i + 1), (myx[i], myy[i]))
plt.show()
</code></pre>
And here is the representation of the sequences:
<a href="https://i.stack.imgur.com/tvzUxm.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/tvzUxm.png" alt="enter image description here"/></a>
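The remark about <code>dtype='float32'</code> in the padding step can be checked without Keras. Below is a minimal pure-Python sketch of what post-padding and the reshape to [samples, timesteps, features] do to the four example sequences. The helper name <code>pad_post</code> and its <code>cast</code> argument are hypothetical stand-ins for <code>pad_sequences(padding='post')</code> and its <code>dtype</code> argument, and the integer case assumes the cast truncates like Python's <code>int()</code>.

```python
def pad_post(sequences, value=0.0, cast=float):
    """Pad every sequence at the end with `value` up to the longest length.

    Hypothetical helper, not a Keras function: it imitates post-padding,
    and `cast` plays the role of the dtype argument.
    """
    maxlen = max(len(s) for s in sequences)
    padded = []
    for s in sequences:
        row = [cast(v) for v in s]                   # convert the real values
        row += [cast(value)] * (maxlen - len(row))   # fill the tail
        padded.append(row)
    return padded


sequences = [
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
    [0.2, 0.4, 0.6, 0.8],
    [0.3, 0.6, 0.9, 1.2],
]

as_float = pad_post(sequences, cast=float)
as_int = pad_post(sequences, cast=int)  # assumed effect of an integer dtype

# reshape to [samples, timesteps, features] with one feature per timestep
as_3d = [[[v] for v in row] for row in as_float]

print(len(as_3d), len(as_3d[0]), len(as_3d[0][0]))  # 4 9 1
print(as_float[2])  # [0.2, 0.4, 0.6, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0]
print(as_int[3])    # [0, 0, 0, 1, 0, 0, 0, 0, 0]
```

With an integer cast almost all of the signal collapses to zero, which is why the float dtype matters here. Note also that the padded zeros are treated as real timesteps by the model above, so the reconstruction loss covers the padding as well.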
