Input of layer sequential is incompatible with the layer: shape error in LSTM

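2 Answers

User

Answer 1 · edited 2024-04-20 08:55:19

This is a multivariate regression problem being solved with an LSTM. Before jumping into the code, let's look at what that means.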

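Problem statement:

For k days, you have 5 features per day: holidays, day_of_month, day_of_week, month, quarter
For any day n, given the features of the last m days, you want to predict the y of day n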

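Creating the windowed dataset:

We first need to decide how many days we want to feed into the model. This is called the sequence length (for this example, let's fix it at 3)
We have to split the data into runs of sequence-length days to create the train and test datasets. This is done with a sliding window whose size is the sequence length
Note that the last p records have no prediction available, where p is the sequence length
We will create the windowed dataset with the timeseries_dataset_from_array method
For more advanced material, see the official TensorFlow docs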
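To show how windows and targets line up, here is a minimal NumPy-only sketch of the sliding window described above. The toy sizes and the variable names are assumptions made for illustration; the answer itself builds the windows with timeseries_dataset_from_array.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

sequence_length = 3
n_days, n_features = 8, 5  # toy sizes, chosen only for illustration

# Row d holds the 5 features of day d; y[d] is that day's target.
features = np.arange(n_days * n_features, dtype=float).reshape(n_days, n_features)
y = np.arange(n_days, dtype=float) * 10

# sliding_window_view appends the window axis last, giving
# (n_windows, n_features, sequence_length); transpose to (n_windows, sequence_length, n_features).
windows = sliding_window_view(features, sequence_length, axis=0).transpose(0, 2, 1)

# The window starting at day i ends at day i + sequence_length - 1,
# so its target is the y of that last day.
targets = y[sequence_length - 1:]

print(windows.shape)  # (6, 3, 5)
print(targets.shape)  # (6,)
```

Each window has exactly one target, and the first window (days 0 to 2) is paired with the y of day 2.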

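LSTM model

So, what we want to achieve looks like this:

At each unrolled LSTM step we pass in that day's 5 features, and we unroll for m time steps, where m is the sequence length. We predict the y of the last day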
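To make the unrolling concrete, here is a minimal NumPy sketch of one LSTM layer's forward pass over a single window. The weights are random placeholders, and the helper name lstm_last_hidden and the gate ordering are illustrative assumptions rather than Keras internals. The final linear read-out stands in for a Dense(1) layer.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_last_hidden(window, W, U, b):
    """Unroll one LSTM layer over a (m, n_features) window and return the last hidden state."""
    units = U.shape[0]
    h = np.zeros(units)  # hidden state
    c = np.zeros(units)  # cell state
    for x_t in window:   # one step per day, each step sees that day's features
        z = x_t @ W + h @ U + b        # pre-activations of all four gates, shape (4 * units,)
        i, f, g, o = np.split(z, 4)    # input gate, forget gate, candidate, output gate
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g              # update the cell state
        h = o * np.tanh(c)             # new hidden state
    return h

rng = np.random.default_rng(0)
m, n_features, units = 3, 5, 5         # sequence length, features per day, LSTM units

W = rng.normal(size=(n_features, 4 * units))  # input weights (placeholder values)
U = rng.normal(size=(units, 4 * units))       # recurrent weights (placeholder values)
b = np.zeros(4 * units)

window = rng.normal(size=(m, n_features))     # one window of m days
h_last = lstm_last_hidden(window, W, U, b)

w_out = rng.normal(size=units)                # stand-in for the Dense(1) weights
y_hat = float(h_last @ w_out)                 # one prediction for the last day

print(h_last.shape)  # (5,)
```

Only the hidden state after the final step feeds the read-out, which is what "predict the y of the last day" amounts to.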

Code:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

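# Model
regressor = models.Sequential()
# return_sequences is left at its default (False) so the LSTM emits only the
# last time step, giving one prediction per window
regressor.add(layers.LSTM(5))
regressor.add(layers.Dense(1))
regressor.compile(optimizer='sgd', loss='mse')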

# Dummy data
n = 10000
df = pd.DataFrame(
    {
      'y': np.arange(n),
      'holidays': np.random.randn(n),
      'day_of_month': np.random.randn(n),
      'day_of_week': np.random.randn(n),
      'month': np.random.randn(n),
      'quarter': np.random.randn(n),     
    }
)

# Train test split
# shuffle=False keeps the rows in time order, which the sliding windows rely on
train_df, test_df = train_test_split(df, shuffle=False)
print(train_df.shape, test_df.shape)

# Create the y to be predicted:
# given the last sequence_length days, predict today's y

# Train data
sequence_length = 3
y_pred = train_df['y'][sequence_length-1:].values
# Drop the last sequence_length-1 rows, which cannot start a full window
train_df = train_df[:-(sequence_length-1)].copy()
train_df['y_pred'] = y_pred

# Validation data
y_pred = test_df['y'][sequence_length-1:].values
test_df = test_df[:-(sequence_length-1)].copy()
test_df['y_pred'] = y_pred

# Create window datagenerators

# Train data generator
train_X = train_df[['holidays','day_of_month','day_of_week','month','quarter']]
train_y = train_df['y_pred']
train_dataset = tf.keras.preprocessing.timeseries_dataset_from_array(
    train_X, train_y, sequence_length=sequence_length, shuffle=True, batch_size=4)

# Validation data generator
test_X = test_df[['holidays','day_of_month','day_of_week','month','quarter']]
test_y = test_df['y_pred']
test_dataset = tf.keras.preprocessing.timeseries_dataset_from_array(
    test_X, test_y, sequence_length=sequence_length, shuffle=True, batch_size=4)

# Finally fit the model
regressor.fit(train_dataset, validation_data=test_dataset, epochs=3)

Output:

(7500, 6) (2500, 6)
Epoch 1/3
1874/1874 [==============================] - 8s 3ms/step - loss: 9974697.3664 - val_loss: 8242597.5000
Epoch 2/3
1874/1874 [==============================] - 6s 3ms/step - loss: 8367530.7117 - val_loss: 8256667.0000
Epoch 3/3
1874/1874 [==============================] - 6s 3ms/step - loss: 8379048.3237 - val_loss: 8233981.5000
<tensorflow.python.keras.callbacks.History at 0x7f3e94bdd198>

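User

Answer 2 · edited 2024-04-20 08:55:19

Input:

The problem is that your model expects a 3D input of shape (batch, sequence, features), but your X_train is actually a slice of a DataFrame, and therefore a 2D array: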

X1=df_train[['day_of_month','day_of_week','month','quarter','holidays']]
X_train, y_train =X1, y1

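I assume your columns are meant to be your features, so you would usually "stack slices" of your df so that X_train looks something like this:

Here is a dummy 2D dataset of shape (15,5):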

data = np.zeros((15,5))

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

You can reshape it to add the missing sequence dimension, giving for example (15,1,5):

data = data[:,np.newaxis,:] 

array([[[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]]])

The data is the same, just presented differently. In this example batch = 15 and sequence = 1. I don't know what the sequence length is in your case, but it can be anything.
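As a rough illustration of that last point, the same 15 rows can be grouped into sequences of other lengths. This sketch cuts the rows into non-overlapping chunks, which is only one possible choice (overlapping sliding windows are another). The sequence lengths tried here are arbitrary assumptions.

```python
import numpy as np

data = np.zeros((15, 5))  # 2D array: (rows, features), as in the dummy dataset

# Hypothetical sequence lengths; with plain reshape each must divide the row count.
for sequence in (1, 3, 5):
    X = data.reshape(-1, sequence, data.shape[1])
    print(sequence, X.shape)
# 1 (15, 1, 5)
# 3 (5, 3, 5)
# 5 (3, 5, 5)
```

The number of values never changes; only the split between the batch axis and the sequence axis does.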

Model:

Now, in your model, Keras expects input of shape (batch, sequence, features), and input_shape describes the (sequence, features) part. When you pass this:

input_shape=(X_train.shape[1], 1)

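this is what your model sees: (None, sequence = X_train.shape[1], num_features = 1), where None stands for the batch dimension. I don't think that is what you are trying to do, so once you have reshaped the data you should also fix input_shape to match the new array.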
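One way to keep input_shape in sync with the data, sketched here with NumPy placeholders, is to derive it from the reshaped array instead of typing it by hand. The array sizes are the dummy ones from this answer, not the asker's real data.

```python
import numpy as np

# The 2D frame from the question, as a placeholder array: (rows, features)
X_2d = np.zeros((15, 5))

# What input_shape=(X_train.shape[1], 1) evaluates to on the 2D data:
# 5 time steps with 1 feature each
mismatched_input_shape = (X_2d.shape[1], 1)
print(mismatched_input_shape)  # (5, 1)

# After adding the sequence axis, take everything except the batch axis
X_train = X_2d[:, np.newaxis, :]   # (batch, sequence, features) = (15, 1, 5)
input_shape = X_train.shape[1:]    # (sequence, features)
print(input_shape)  # (1, 5)
```

Passing X_train.shape[1:] means the model definition follows automatically if the sequence length changes later.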
