<p>So I have been using Keras to predict a multivariate time series. The dataset is a pollution dataset: the first column is what I want to predict, and the remaining 7 columns are the features. The dataset is available here:
<a href="https://github.com/sagarmk/Forecasting-on-Air-pollution-with-RNN-LSTM/blob/master/pollution.csv" rel="nofollow noreferrer">https://github.com/sagarmk/Forecasting-on-Air-pollution-with-RNN-LSTM/blob/master/pollution.csv</a></p>
<p>What I would like to do is run the following code against a test set that has no "pollution" column. Assume new feature data is available, but no pollution data. So:</p>
<ol>
<li>How do I train the model without the test data? (model.fit())</li>
<li>How do I predict new pollution values when no future pollution data exists? (model.predict())</li>
</ol>
<p>For simplicity, the dataset can first be split into training and test sets at the start, with the "pollution" column removed from the test set.</p>
<p>Below is the code, kept simple.
Here I just import and preprocess the dataset:</p>
<pre><code>import numpy as np
import pandas as pd
import matplotlib as plt
import matplotlib.pyplot  # make sure the pyplot submodule (used as plt.pyplot below) is loaded
import seaborn as sns
import plotly.express as px

# Import data
dataset = pd.read_csv("pollution.csv")
dataset = dataset.drop(['date'], axis = 1)

# Encode the categorical wind-direction column as integer codes
label, unique_dirs = pd.factorize(dataset['wnd_dir'])
dataset['wnd_dir'] = label

# Replace missing values with the column means
dataset = dataset.fillna(dataset.mean())
dataset.head()
</code></pre>
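<p>As a quick illustration, <code>pd.factorize</code> assigns each distinct wind-direction string an integer in order of first appearance (toy values of my own below):</p>

```python
import pandas as pd

# Hypothetical wind-direction strings, just to show the encoding
codes, uniques = pd.factorize(pd.Series(["SE", "NW", "SE", "NE"]))
print(codes.tolist())  # [0, 1, 0, 2] -- 0 for the first value seen, and so on
print(list(uniques))   # ['SE', 'NW', 'NE']
```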
<p>After that, I normalize the dataset:</p>
<pre><code>from sklearn.preprocessing import MinMaxScaler
values = dataset.values
scaler = MinMaxScaler()
scaled = scaler.fit_transform(values)
scaled[0]
</code></pre>
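<p><code>MinMaxScaler</code> rescales each column to [0, 1] independently and can invert the transform exactly, which is what the final plotting step relies on. A minimal sketch with made-up numbers:</p>

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy 2-column data to show the per-column scaling and the round trip
data = np.array([[0.0, 10.0],
                 [5.0, 20.0],
                 [10.0, 30.0]])
sc = MinMaxScaler()
scaled_demo = sc.fit_transform(data)          # each column mapped into [0, 1]
restored = sc.inverse_transform(scaled_demo)  # recovers the original values
```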
<p>Then I convert the normalized data into supervised form:</p>
<pre><code>def to_supervised(dataset, dropNa=True, lag=1):
    df = pd.DataFrame(dataset)
    # stack the series next to copies shifted 1..lag steps into the future
    column = [df]
    for i in range(1, lag + 1):
        column.append(df.shift(-i))
    df = pd.concat(column, axis=1)
    df.dropna(inplace=True)
    features = dataset.shape[1]
    df = df.values
    # inputs: lag consecutive timesteps; target: pollution lag steps ahead
    supervised_data = df[:, :features * lag]
    supervised_data = np.column_stack([supervised_data, df[:, features * lag]])
    return supervised_data

timeSteps = 2
supervised = to_supervised(scaled, lag=timeSteps)
pd.DataFrame(supervised).head()
</code></pre>
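<p>To make the framing concrete, here is the same shifting idea applied to a toy 2-feature series (my own numbers): each row of the result holds <code>lag</code> consecutive timesteps as inputs, followed by the first feature <code>lag</code> steps ahead as the target.</p>

```python
import numpy as np
import pandas as pd

def to_supervised_demo(arr, lag=1):
    # same construction as to_supervised above, on a toy array
    df = pd.DataFrame(arr)
    cols = [df] + [df.shift(-i) for i in range(1, lag + 1)]
    out = pd.concat(cols, axis=1).dropna().values
    n_feat = arr.shape[1]
    return np.column_stack([out[:, :n_feat * lag], out[:, n_feat * lag]])

toy = np.array([[1, 10], [2, 20], [3, 30], [4, 40]], dtype=float)
sup = to_supervised_demo(toy, lag=2)
# each row: [x(t), x(t+1), target x0(t+2)]
print(sup)
```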
<p>Now the dataset is split and reshaped so the LSTM network can process it:</p>
<pre><code>features = dataset.shape[1]
train_hours = round(dataset.shape[0]*0.7)
X = supervised[:,:features*timeSteps]
y = supervised[:,features*timeSteps]
x_train = X[:train_hours,:]
x_test = X[train_hours:,:]
y_train = y[:train_hours]
y_test = y[train_hours:]
print(x_train.shape,x_test.shape,y_train.shape,y_test.shape)
# reshape for the LSTM:
# dimensions = (samples, timeSteps (here 2), features)
x_train = x_train.reshape(x_train.shape[0], timeSteps, features)
x_test = x_test.reshape(x_test.shape[0], timeSteps, features)
print(x_train.shape,x_test.shape)
</code></pre>
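<p>The reshape works because each supervised row stores its timesteps contiguously, so a row-major reshape splits it cleanly into <code>(timeSteps, features)</code>. A small check with dummy numbers:</p>

```python
import numpy as np

# 5 samples, 2 timesteps x 8 features flattened into 16 columns, like x_train
flat = np.arange(5 * 16).reshape(5, 16)
seq = flat.reshape(5, 2, 8)  # (samples, timeSteps, features), as the LSTM expects
# the second timestep of the first sample is columns 8..15 of the flat row
print(seq[0, 1].tolist())  # [8, 9, 10, 11, 12, 13, 14, 15]
```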
<p>Here the model is defined and trained:</p>
<pre><code>#define the model
from keras.models import Sequential
from keras.layers import Dense,LSTM
model = Sequential()
model.add( LSTM( 50, input_shape = ( timeSteps,x_train.shape[2]) ) )
model.add( Dense(1) )
model.compile( loss = "mae", optimizer = "adam")
history = model.fit( x_train,y_train, validation_data = (x_test,y_test), epochs = 50 , batch_size = 72, verbose = 0, shuffle = False)
plt.pyplot.plot(history.history['loss'], label='train')
plt.pyplot.plot(history.history['val_loss'], label='test')
plt.pyplot.legend()
#plt.pyplot.yticks([])
#plt.pyplot.xticks([])
plt.pyplot.title("loss during training")
plt.pyplot.show()
</code></pre>
<p>Finally, I invert the scaling and plot the predictions against the actual data:</p>
<pre><code>y_pred = model.predict(x_test)
x_test = x_test.reshape(x_test.shape[0], x_test.shape[2] * x_test.shape[1])
# put the prediction back next to the other 7 features so the 8-column scaler can invert it
inv_new = np.concatenate((y_pred, x_test[:, -7:]), axis=1)
inv_new = scaler.inverse_transform(inv_new)
final_pred = inv_new[:, 0]
plt.pyplot.figure(figsize=(20,10))
plt.pyplot.plot(dataset['pollution'])
plt.pyplot.plot([None] * train_hours + [x for x in final_pred])  # offset so predictions line up with the test period
plt.pyplot.show()
</code></pre>
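<p>For question 2, what I imagine is a recursive loop: predict one step, substitute the prediction for the missing pollution value, and slide the window forward using the known future features. A framework-free sketch of that loop (<code>fake_predict</code> is a stand-in for <code>model.predict</code>, and all the numbers are placeholders):</p>

```python
import numpy as np

def fake_predict(window):
    # stand-in for model.predict: maps a (1, timeSteps, features)
    # window to a single predicted pollution value
    return np.array([[window.mean()]])

timeSteps, features = 2, 8
window = np.zeros((timeSteps, features))             # last known scaled window
future_features = np.ones((5, features - 1)) * 0.5   # the 7 known future features

preds = []
for feats in future_features:
    yhat = fake_predict(window[np.newaxis, ...])[0, 0]
    preds.append(yhat)
    # next timestep = predicted pollution + known features; slide the window
    next_step = np.concatenate([[yhat], feats])
    window = np.vstack([window[1:], next_step])
```

Is this the right way to do it with <code>model.predict</code>?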