Connecting an attention layer to the decoder input in a Keras seq2seq model

encoder_input = Input(shape=(MAX_LENGTH_Input, ))
embedded = Embedding(input_dim=vocab_size_input, output_dim=embedding_width, trainable=False)(encoder_input)
encoder = Bidirectional(LSTM(units=hidden_size, input_shape=(MAX_LENGTH_Input, embedding_width), return_sequences=True, dropout=0.25, recurrent_dropout=0.25))(embedded)
attention = Attention(MAX_LENGTH_Input)(encoder)
decoder_input = Input(shape=(MAX_LENGTH_Output, vocab_size_output))
merge = concatenate([attention, decoder_input])
decoder = Bidirectional(LSTM(units=hidden_size, input_shape=(MAX_LENGTH_Output, vocab_size_output))(merge))
output = TimeDistributed(Dense(MAX_LENGTH_Output, activation="softmax"))(decoder)

1 Answer

#1 · Posted 2024-04-29 06:43:11

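Judging from your block diagram, you are passing the same attention vector to the decoder at every time step. In that case the attention output is a 2D tensor of shape (batch, features), while decoder_input is 3D, so the two cannot be concatenated directly. Use RepeatVector to copy the attention vector once per output time step, which turns the 2D attention tensor into a 3D one: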

# ...
attention = Attention(MAX_LENGTH_Input)(encoder)
attention = RepeatVector(MAX_LENGTH_Output)(attention) # (?, 10, 1024)
decoder_input = Input(shape=(MAX_LENGTH_Output,vocab_size_output))
merge = concatenate([attention, decoder_input]) # (?, 10, 1024+8281)
# ...

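Note that this repeats the same attention vector at every time step, so the decoder receives one fixed context vector for the whole output sequence.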
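To check the shapes without building the full model, here is a minimal NumPy sketch of what RepeatVector followed by concatenate does to the tensors. It is a stand-in for the Keras layers, not the model itself, and the sizes (batch of 2, 10 output steps, 4 attention features, 3 vocabulary entries) are made up for illustration; the answer's own figures are 1024 and 8281.

```python
import numpy as np

# Hypothetical small sizes, chosen only to keep the example readable.
batch, max_len_out, attn_dim, vocab_out = 2, 10, 4, 3

# 2D attention output: one context vector per sample, shape (batch, attn_dim).
attention = np.arange(batch * attn_dim, dtype="float32").reshape(batch, attn_dim)

# What RepeatVector(max_len_out) does: copy each vector along a new time axis.
repeated = np.repeat(attention[:, np.newaxis, :], max_len_out, axis=1)

# 3D decoder input, shape (batch, max_len_out, vocab_out).
decoder_input = np.zeros((batch, max_len_out, vocab_out), dtype="float32")

# What concatenate([...]) does by default: join along the last (feature) axis.
merged = np.concatenate([repeated, decoder_input], axis=-1)

print(repeated.shape)  # (2, 10, 4)
print(merged.shape)    # (2, 10, 7)
```

Every time step of `repeated` holds an identical copy of the attention vector, which is why the last axis of `merged` is simply the sum of the two feature sizes.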
