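<p>Here is an example.</p>
<p>Let's create synthetic data consisting of a few sequences. The idea is to look at these sequences through the lens of an autoencoder. In other words, we reduce their dimensionality, or summarize each one as a fixed-length vector.</p>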
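<pre><code>import numpy as np
import pandas as pd
from sklearn import preprocessing
# assumes the Keras bundled with TensorFlow 2.x
from tensorflow.keras.preprocessing.sequence import pad_sequences

# define input sequences (they differ in length, so keep them as a list of lists)
sequence = [[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
            [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
            [0.2, 0.4, 0.6, 0.8],
            [0.3, 0.6, 0.9, 1.2]]
# prepare to normalize: one column per sequence, NaN where a sequence is shorter
x = pd.DataFrame(sequence).T.values
scaler = preprocessing.StandardScaler()
x_scaled = scaler.fit_transform(x)
# standardized sequences with the NaN filler dropped
# (not used below: the raw values are what gets padded and fed to the model)
sequence_normalized = [col[~np.isnan(col)] for col in x_scaled.T]
# make sure to use dtype='float32' in padding; the default int32 would truncate the floating-point values
sequence = pad_sequences(sequence, padding='post', dtype='float32')
# reshape input into [samples, timesteps, features]
n_obs = len(sequence)
n_in = 9
sequence = sequence.reshape((n_obs, n_in, 1))
</code></pre>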
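<p>To make the padding and reshaping step concrete, here is a minimal NumPy-only sketch of what post-padding does to these sequences. The helper <code>pad_post</code> is a hypothetical name used for illustration, not part of Keras, and it only mimics the <code>padding='post'</code> behaviour for this simple case.</p>

```python
import numpy as np

def pad_post(sequences, value=0.0, dtype="float32"):
    """Right-pad variable-length sequences with `value` up to the longest length.

    Hypothetical helper for illustration; it is not the Keras implementation.
    """
    max_len = max(len(s) for s in sequences)
    padded = np.full((len(sequences), max_len), value, dtype=dtype)
    for i, s in enumerate(sequences):
        padded[i, :len(s)] = s  # real values first, padding stays at the end
    return padded

raw = [[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
       [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
       [0.2, 0.4, 0.6, 0.8],
       [0.3, 0.6, 0.9, 1.2]]

padded = pad_post(raw)
# add the trailing feature axis: [samples, timesteps, features]
batch = padded.reshape((padded.shape[0], padded.shape[1], 1))
print(batch.shape)  # (4, 9, 1)
print(padded[2])    # the third sequence, followed by five padding zeros
```

<p>Note that the padding value 0 is indistinguishable from a real observation of 0, so the model sees the padded steps as data.</p>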
<p>Let's design a simple LSTM autoencoder.</p>
<pre><code>import matplotlib.pyplot as plt
from tensorflow.keras.layers import Input, LSTM, RepeatVector, TimeDistributed, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.utils import plot_model

# define encoder: compresses each sequence into a vector of length 2
visible = Input(shape=(n_in, 1))
encoder = LSTM(2, activation='relu')(visible)
# define reconstruction decoder
decoder1 = RepeatVector(n_in)(encoder)
decoder1 = LSTM(100, activation='relu', return_sequences=True)(decoder1)
decoder1 = TimeDistributed(Dense(1))(decoder1)
# tie it together
myModel = Model(inputs=visible, outputs=decoder1)
# summarize layers (summary() prints the table itself)
myModel.summary()
myModel.compile(optimizer='adam', loss='mse')
# with only 4 sequences, validation_split=0.1 holds out the last one for validation
history = myModel.fit(sequence, sequence,
                      epochs=400,
                      verbose=0,
                      validation_split=0.1,
                      shuffle=True)
# save a diagram of the model (requires pydot and graphviz)
plot_model(myModel, show_shapes=True, to_file='reconstruct_lstm_autoencoder.png')
# demonstrate reconstruction
yhat = myModel.predict(sequence, verbose=0)
# plot the training and validation loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model train vs validation loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper right')
plt.show()
</code></pre>
<p><a href="https://i.stack.imgur.com/De3hEm.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/De3hEm.png" alt="enter image description here"/></a></p>
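<p>The loss curve averages the error over all sequences and all timesteps, padding included. To see how well each individual sequence is reconstructed, one option is a per-sequence error that ignores the padded steps. The sketch below is only an illustration: <code>per_sequence_mse</code> is a hypothetical helper, and the toy arrays are made-up stand-ins for <code>sequence</code> and <code>yhat</code>, not actual model output.</p>

```python
import numpy as np

def per_sequence_mse(original, reconstructed, lengths):
    """MSE of each sequence, counting only its first `lengths[i]` timesteps.

    Both arrays are expected in [samples, timesteps, 1] layout.
    """
    original = np.asarray(original, dtype="float64")
    reconstructed = np.asarray(reconstructed, dtype="float64")
    n_steps = original.shape[1]
    # mask[i, t] is True for real timesteps and False for padding
    mask = np.arange(n_steps)[None, :] < np.asarray(lengths)[:, None]
    sq_err = (original[..., 0] - reconstructed[..., 0]) ** 2
    return (sq_err * mask).sum(axis=1) / mask.sum(axis=1)

# toy stand-ins for `sequence` and `yhat` (made-up numbers)
toy_original = np.array([[[0.2], [0.4], [0.0]],
                         [[0.3], [0.6], [0.9]]])
toy_reconstructed = np.array([[[0.1], [0.4], [0.5]],
                              [[0.3], [0.6], [0.9]]])

errors = per_sequence_mse(toy_original, toy_reconstructed, lengths=[2, 3])
print(errors)
```

<p>With the data above, the call would look like <code>per_sequence_mse(sequence, yhat, lengths=[9, 8, 4, 4])</code>. Keep in mind that the model itself was trained on the padded steps too, so this masked error is a different quantity from the training loss.</p>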
<p><strong>Now let's use the trained autoencoder to encode the sequences</strong></p>
<pre><code># reuse the trained encoder LSTM to encode the training input
# (layer 0 is the input layer, layer 1 is the encoder LSTM)
encoder_layer = myModel.layers[1]
encoder_input = Input(shape=(n_in, 1))
encoder_model = Model(encoder_input, encoder_layer(encoder_input))
# see what the encoded sequences look like; each one has length 2, the size of the encoder layer
out = encoder_model.predict(sequence)
f = plt.figure()
myx = out[:, 0]
myy = out[:, 1]
s = plt.scatter(myx, myy)
# label each point with its 1-based sequence number
for i in range(len(out)):
    plt.annotate(str(i + 1), (myx[i], myy[i]))
plt.show()
</code></pre>
<p>Here is the resulting representation of the sequences:</p>
<p><a href="https://i.stack.imgur.com/tvzUxm.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/tvzUxm.png" alt="enter image description here"/></a></p>
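<p>If you want numbers to go with the scatter plot, one simple option is the pairwise Euclidean distance between the encoded vectors. This is a sketch under assumptions: <code>pairwise_distances</code> is a hypothetical helper written here in plain NumPy, and <code>toy_codes</code> holds made-up 2-dimensional codes rather than the real encoder output.</p>

```python
import numpy as np

def pairwise_distances(codes):
    """Euclidean distance between every pair of rows in `codes`."""
    codes = np.asarray(codes, dtype="float64")
    # broadcast to [n, n, dim] differences, then reduce over the last axis
    diff = codes[:, None, :] - codes[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# made-up codes standing in for the encoder output `out`
toy_codes = np.array([[0.0, 0.0],
                      [3.0, 4.0],
                      [0.0, 1.0]])

dist = pairwise_distances(toy_codes)
print(dist)
```

<p>Calling <code>pairwise_distances(out)</code> on the real codes gives a 4x4 matrix; small entries correspond to sequences that the encoder places close together in the plot.</p>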