SARIMAX walk-forward forecast appears to be off by 1 period each time

Posted 2024-04-18 01:03:06


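Visually, my SARIMAX forecasts appear to be off by 1 period (late), but I can't for the life of me figure out why. From what I've written, each forecast should be plotted at the date/time of the test data point the loop is currently iterating over. The training data is always 1 period behind the test index.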

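The RMSE between the test data and the forecasts is reported as 0.378. Is that a bad result? I wouldn't be asking any of this if the lag in the chart weren't so conspicuous.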
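Whether 0.378 is good or bad depends on the scale of the prices, so one way to put it in context is to compare it with a naive persistence forecast (the prediction for time t is simply the actual value at t-1). For a series that behaves much like a random walk, a one-step forecast tends to sit close to the previous actual, which on a plot looks like the prediction line is shifted one period late. The sketch below uses made-up random-walk data in place of the close prices; `rmse` and `best_lag` are hypothetical helpers, not part of the script in this question.

```python
import numpy as np
import pandas as pd

def rmse(actual: pd.Series, predicted: pd.Series) -> float:
    """Root-mean-square error between two index-aligned series."""
    return float(np.sqrt(((actual - predicted) ** 2).mean()))

def best_lag(actual: pd.Series, predicted: pd.Series, max_lag: int = 3) -> int:
    """Shift k (in periods) for which actual.shift(k) is closest to the predictions."""
    scores = {
        k: rmse(actual.shift(k).loc[predicted.index], predicted)
        for k in range(max_lag + 1)
    }
    return min(scores, key=scores.get)

# Made-up data: a random walk standing in for the close prices (assumption).
rng = np.random.default_rng(0)
idx = pd.date_range("2024-01-01", periods=200, freq="D")
actual = pd.Series(100 + rng.normal(0, 1, len(idx)).cumsum(), index=idx)

# Naive persistence forecast: prediction at t is the actual at t-1.
naive = actual.shift(1)

# Score only the last 25 points, mirroring `forecasts = 25` in the question.
test = actual.iloc[-25:]
naive_rmse = rmse(test, naive.loc[test.index])

# Lag at which the naive predictions line up best with the actuals.
lag = best_lag(actual, naive.loc[test.index])
print("naive RMSE:", round(naive_rmse, 3), "best lag:", lag)
```

If the SARIMAX RMSE is not clearly below `naive_rmse` computed on the real test window, and `best_lag` applied to the real predictions returns 1, the model is probably doing little more than repeating the last observed value, rather than the plot being misaligned.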


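import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.statespace.sarimax import SARIMAX

# timeframe : custom class that holds (among other things):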
#    Frequency (pandas-compatible string representing periodicity)
#    Data (pandas dataframe where cols = close, open, high, low, volume, rsi; indexes = symbol, time)
#    Sarimax (dict that holds key (symbol) => value (dict of ARIMA order tuples)) generated by earlier script
# s : symbol name (ex., 'SPY')

forecasts = 25
def forecast_data(timeframe, series):
    data = series.asfreq(timeframe.Frequency)
    data.interpolate(inplace=True)
    # Limit datasize due to processing time (some models may fail due to too few nobs!)
    data = data.tail(1000)
    horizon = len(data) - forecasts
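    # Return copies so that appending to the training set later does not touch `data`
    return data.iloc[:horizon].copy(), data.iloc[horizon:].copy()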

# Datasets (training, testing, exog)
data = timeframe.Data.loc[s].close
training, testing = forecast_data(timeframe, data)

# Am I using exog correctly? I want to incorporate RSI into the predictive model
exog = timeframe.Data.loc[s].rsi
exog_training, exog_testing = forecast_data(timeframe, exog)

# Walk-Forward Forecasting
predictions = testing.copy(deep=True)
i = 1
print("{} SARIMAX {}x{} : {} rows".format(timeframe, timeframe.Sarimax[s]['order'], timeframe.Sarimax[s]['seasonal_order'], len(training)))
for index, value in testing.items():  # Series.iteritems() was removed in pandas 2.0
    print("   Forecasting {}/{} ({} @ {})".format(i, forecasts, '%.4f' % value, index), end='\r')
    # Fit Model
    fit = SARIMAX(training, order=timeframe.Sarimax[s]['order'], seasonal_order=timeframe.Sarimax[s]['seasonal_order'], enforce_stationarity=False, enforce_invertibility=False, exog=exog_training).fit()
    # one step forecast at current testing date from past training data
    # Am I using exog correctly here?
    predictions.loc[index] = fit.forecast(exog=pd.DataFrame(exog_training.tail(1))).iloc[0]
    # move testing data into training data for the next fit + forecast
    training.loc[index] = value
    exog_training[index] = exog_testing[index]
    i += 1
print('')

# Data/Fit Comparison
plt.figure(figsize=(16, 5))
plt.xlabel("Timeframe: {}".format(timeframe))
plt.ylabel("Price")
# Trim training plot for better visual inspection
training = training[-forecasts:]
plt.ylim(bottom=min(training), top=max(training))
training.plot(label=s + " Actuals", marker='o')
predictions.plot(label=s + " Predictions", marker='o')
plt.legend(loc='upper left')
ax = plt.gca()
ax.grid(which='major', alpha=0.5, linestyle='--')
ax.grid(which='minor', alpha=0.5, linestyle=':')
plt.show()
print(fit.summary())
fit.plot_diagnostics()
plt.show()
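On the exog question in the comments above: as far as I know, statsmodels treats the `exog` passed to `forecast` as the regressor values for the periods being forecast, whereas the loop passes the RSI from the last training row. One way to make fitting and forecasting consistent without peeking at the future is to lag RSI by one period before fitting, so that the RSI known at t-1 is the regressor for t. A pandas-only sketch with made-up numbers follows; `rsi_lag1`, `aligned` and `next_exog` are hypothetical names, and this is an assumption about what the model should see, not a tested fix.

```python
import pandas as pd

# Made-up values standing in for timeframe.Data.loc[s].close and .rsi (assumption).
idx = pd.date_range("2024-01-01", periods=6, freq="D")
close = pd.Series([10.0, 10.5, 10.2, 10.8, 11.0, 11.3], index=idx)
rsi = pd.Series([40.0, 45.0, 50.0, 55.0, 60.0, 65.0], index=idx)

# Lag the regressor by one period: row t now holds the RSI observed at t-1.
rsi_lag1 = rsi.shift(1)

# Keep endog and exog on the same index; the first row has no previous RSI.
aligned = pd.concat({"close": close, "rsi_lag1": rsi_lag1}, axis=1).dropna()

# Same train/test split idea as forecast_data(), with a tiny test window.
forecasts = 2
horizon = len(aligned) - forecasts
train, test = aligned.iloc[:horizon], aligned.iloc[horizon:]

# Exog row for the first out-of-sample step: it is indexed at the forecast
# date but contains only information available at the end of training.
next_exog = test[["rsi_lag1"]].iloc[:1]
print(next_exog)
```

With this layout, `train["close"]` and `train[["rsi_lag1"]]` would go to the model as endog and exog, and `next_exog` would be the one-row frame handed to `forecast` for the next step.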

I would post this on Cross Validated, but that place feels like a ghost town.


Tags: data, index, value, training, order, plt, testing, loc
