Forecasting with exogenous variables in ARIMAX in Python

Posted 2024-06-16 13:19:12


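I am trying to forecast a variable called the yield spread ("yieldsp") using several macroeconomic variables. "yieldsp" is a column in a DataFrame named "stat2", which has a datetime index. Initially, I forecast "yieldsp" with an ARIMA model using the following code: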

# Assumed import (not shown in the original post): the legacy statsmodels
# ARIMA class, whose fit() accepts disp=0.
from statsmodels.tsa.arima_model import ARIMA

# Fit the model on the training set and generate a prediction for each element of the test set.
# Perform a rolling forecast: re-create the ARIMA model each time a new observation is received.
# forecast(): performs a one-step forecast from the model.
# history: list seeded with the training set that tracks all observations;
# after each iteration, the new observation is appended to it.

yieldsp = stat2["yieldsp"]

X = yieldsp.values
size = int(len(X) * 0.95)
train, test = X[0:size], X[size:len(X)]
history = [x for x in train]
predictions = list()

# walk-forward validation
for t in range(len(test)):
    model = ARIMA(history, order=(1,0,4))
    model_fit = model.fit(disp=0)
    output = model_fit.forecast()
    yhat = output[0]
    predictions.append(yhat)
    obs = test[t]
    history.append(obs)
    print('predicted=%f, expected=%f' % (yhat, obs))

It works and produces predicted and expected values. Part of the output is shown below:

predicted=0.996081, expected=0.960000
predicted=0.959644, expected=0.940000
predicted=0.937272, expected=0.930000
predicted=0.932651, expected=0.970000
predicted=0.976372, expected=0.960000
predicted=0.961283, expected=0.940000

But now I want to forecast "yieldsp" using multiple variables. The variables in "stat2" are:

yieldsp = stat2[['ffr', 'house_st_change','rwage', 'epop_diff2','ipi_change_diff2', 'sahm_diff2']]


              ffr  house_st_change     rwage     epop_diff2  ipi_change_diff2  sahm_diff2  yieldsp
Date
1982-03-31  14.68       -28.713629  0.081837  -4.000000e-01         -3.614082    0.227545     0.19
1982-04-30  14.94       -32.573529  0.081789   2.000000e-01          0.838893   -0.061298     0.72
1982-05-31  14.45       -10.087719  0.081752   2.000000e-01         -0.765399   -0.062888     1.74
1982-06-30  14.15       -13.684211  0.080928  -2.000000e-01          0.421589   -0.039439     1.08
1982-07-31  12.59        12.007685  0.081026  -1.421085e-14         -0.141606   -0.032772     3.11

So I tried the following:

yieldsp = stat2[['ffr', 'house_st_change','rwage', 'epop_diff2','ipi_change_diff2', 'sahm_diff2', 'yieldsp']]

X = yieldsp.values
size = int(len(X) * 0.8)
train, test = X[0:size], X[size:len(X)]
history = [x for x in train]
predictions = list()

# walk-forward validation
for t in range(len(test)):
    model = ARIMA(history, order=(1,0,1))
    model_fit = model.fit(disp=0)
    output = model_fit.forecast()
    yhat = output[0]
    predictions.append(yhat)
    obs = test[t]
    history.append(obs)
    print('predicted=%f, expected=%f' % (yhat, obs))

But I got an error:

ValueError: could not broadcast input array from shape (7) into shape (1)

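I don't know how to solve this. I think that to forecast "yieldsp" we also need forecast values of the exogenous variables. I also think we need to modify the part of the code that reads: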

history = [x for x in train]
predictions = list()

# walk-forward validation
for t in range(len(test)):
    model = ARIMA(history, order=(1,0,4))
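
One way these lines might be adapted is to keep the target and the regressors in two parallel arrays instead of one list of 7-element rows. The broadcast error is consistent with each element of `history` having shape (7,) where a univariate model expects one number per time step. The toy sketch below uses the first three rows of the table and assumes the last column is the target and the first six are exogenous regressors; the names `endog` and `exog` are illustrative, not from the original code.

```python
import numpy as np

# First three rows of the table: six regressors followed by yieldsp.
X = np.array([
    [14.68, -28.713629, 0.081837, -0.4, -3.614082, 0.227545, 0.19],
    [14.94, -32.573529, 0.081789, 0.2, 0.838893, -0.061298, 0.72],
    [14.45, -10.087719, 0.081752, 0.2, -0.765399, -0.062888, 1.74],
])

# As in the failing code: every element of history is a full 7-value row.
history = [x for x in X]
print(history[0].shape)  # (7,)

# Assumed split: last column is the target, the rest are regressors.
endog = X[:, -1]
exog = X[:, :-1]
print(endog.shape, exog.shape)  # (3,) (3, 6)
```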

I would appreciate any help.

(Reference: https://machinelearningmastery.com/make-sample-forecasts-arima-python/)

