Correctly interpreting the array returned by statsmodels.tsa.ar_model.ar_select_order to determine the optimal lag

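    import pandas as pd
    from statsmodels.tsa.ar_model import AutoReg, ar_select_order

    # Raw string so that "\u" in the Windows path is not parsed as a unicode escape
    df = pd.read_csv(r'Data\uspopulation.csv', index_col='DATE', parse_dates=True)
    df.index.freq = 'MS'  # monthly data, month-start frequency
    train_data = df.iloc[:84]
    test_data = df.iloc[84:]
    # Search AR orders up to 12 lags (BIC is the default criterion)
    modelp = ar_select_order(train_data['PopEst'], maxlag=12)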

1 answer

Anonymous user

#1 · Posted 2024-04-26 02:37:27

The code above returns a numpy array of [ 1 2 3 4 5 6 7 8 9 10 11 12], which I am interpreting as "The optimal lag p is 12" as per this StackOverflow question: stackoverflow.

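Yes, that is correct. It returns an array rather than just 12 because, if you set glob=True, it can also search over models that do not include every lag. For example, [ 1 2 3 12] could be a common result for a monthly model with some kind of annual seasonal pattern.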

However, evaluating on some metrics (RMSE) I find that my AutoReg fitted models with maxlag=12 are performing worse than lower order models. By trial and error I found that the optimal lag is 8. So I am having difficulty interpreting the resulting numpy array, I have been reading the resources on statsmodels.com/ar_select_order and statsmodels.com/autoregressions but they have not made it clearer.

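This function returns the model that is judged optimal according to an information criterion. In particular, the default is BIC, the Bayesian information criterion. If you use a different criterion, such as minimizing out-of-sample RMSE, it is certainly possible to find that a different model is judged optimal.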
