Python-pandas时间序列w/分层索引和滚动/移位

closePrice datetime ticker 2014-02-04 09:15:00 AAPL xxx EQIX xxx FB xxx GOOG xxx MSFT xxx 2014-02-04 09:30:00 AAPL xxx EQIX xxx FB xxx GOOG xxx MSFT xxx 2014-02-04 09:45:00 AAPL xxx EQIX xxx FB xxx GOOG xxx MSFT xxx

def f12ema(x): K = 2 / (12 + 1) return x['price_nom'] * K + x['12ema'].shift(-1) * (1-K) df1.apply(f12ema, axis=1) AttributeError: ("'numpy.float64' object has no attribute 'shift'", u'occurred at index 2014-02-05 11:45:00')

1条回答

网友

1楼 · 发布于 2024-04-20 07:42:52

创建包含所需时间的日期范围

In [47]: rng = date_range('20130102 09:30:00','20130105 16:00:00',freq='15T')

In [48]: rng = rng.take(rng.indexer_between_time('09:30:00','16:00:00'))

In [49]: rng
Out[49]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-02 09:30:00, ..., 2013-01-05 16:00:00]
Length: 108, Freq: None, Timezone: None

创建一个类似于你的框架（2000个股票代码x日期）

^{pr2}$

重新排序的水平，使其股票指数x日期，排序它！！！！在

In [51]: df = df.swaplevel('ticker','date').sortlevel()

In [52]: df
Out[52]: 
                               close
ticker date                         
0      2013-01-02 09:30:00  0.840767
       2013-01-02 09:45:00  1.808101
       2013-01-02 10:00:00 -0.718354
       2013-01-02 10:15:00 -0.484502
       2013-01-02 10:30:00  0.563363
       2013-01-02 10:45:00  0.553920
       2013-01-02 11:00:00  1.266992
       2013-01-02 11:15:00 -0.641117
       2013-01-02 11:30:00 -0.574673
       2013-01-02 11:45:00  0.861825
       2013-01-02 12:00:00 -1.562111
       2013-01-02 12:15:00 -0.072966
       2013-01-02 12:30:00  0.673079
       2013-01-02 12:45:00  0.766105
       2013-01-02 13:00:00  0.086202
       2013-01-02 13:15:00  0.949205
       2013-01-02 13:30:00 -0.381194
       2013-01-02 13:45:00  0.316813
       2013-01-02 14:00:00 -0.620176
       2013-01-02 14:15:00 -0.193126
       2013-01-02 14:30:00 -1.552111
       2013-01-02 14:45:00  1.724429
       2013-01-02 15:00:00 -0.092393
       2013-01-02 15:15:00  0.197763
       2013-01-02 15:30:00  0.064541
       2013-01-02 15:45:00 -1.574853
       2013-01-02 16:00:00 -1.023979
       2013-01-03 09:30:00 -0.079349
       2013-01-03 09:45:00 -0.749285
       2013-01-03 10:00:00  0.652721
       2013-01-03 10:15:00 -0.818152
       2013-01-03 10:30:00  0.674068
       2013-01-03 10:45:00  2.302714
       2013-01-03 11:00:00  0.614686

                                 ...

[216000 rows x 1 columns]

按股票行情分组。返回滚动平均值和ewma应用程序的每个股票代码的数据帧。请注意，有许多选项可用于控制这一点，例如，窗口化，您可以使其不环绕几天，等等

In [53]: df.groupby(level='ticker')['close'].apply(lambda x: concat({ 'spma' : pd.rolling_mean(x,3), 'ema' : pd.ewma(x,3) }, axis=1))
Out[53]: 
                                 ema      spma
ticker date                                   
0      2013-01-02 09:30:00  0.840767       NaN
       2013-01-02 09:45:00  1.393529       NaN
       2013-01-02 10:00:00  0.480282  0.643504
       2013-01-02 10:15:00  0.127447  0.201748
       2013-01-02 10:30:00  0.270334 -0.213164
       2013-01-02 10:45:00  0.356580  0.210927
       2013-01-02 11:00:00  0.619245  0.794758
       2013-01-02 11:15:00  0.269100  0.393265
       2013-01-02 11:30:00  0.041032  0.017067
       2013-01-02 11:45:00  0.258476 -0.117988
       2013-01-02 12:00:00 -0.216742 -0.424986
       2013-01-02 12:15:00 -0.179622 -0.257750
       2013-01-02 12:30:00  0.038741 -0.320666
       2013-01-02 12:45:00  0.223881  0.455406
       2013-01-02 13:00:00  0.188995  0.508462
       2013-01-02 13:15:00  0.380972  0.600504
       2013-01-02 13:30:00  0.188987  0.218071
       2013-01-02 13:45:00  0.221125  0.294942
       2013-01-02 14:00:00  0.009907 -0.228185
       2013-01-02 14:15:00 -0.041013 -0.165496
       2013-01-02 14:30:00 -0.419688 -0.788471
       2013-01-02 14:45:00  0.117299 -0.006936

       2013-01-04 10:00:00 -0.060415  0.341013
       2013-01-04 10:15:00  0.074068  0.604611
       2013-01-04 10:30:00 -0.108502  0.440256
       2013-01-04 10:45:00 -0.514229 -0.636702
                                 ...       ...

[216000 rows x 2 columns]

相当不错的表现，因为它基本上是循环的股票。在

In [54]: %timeit df.groupby(level='ticker')['close'].apply(lambda x: concat({ 'spma' : pd.rolling_mean(x,3), 'ema' : pd.ewma(x,3) }, axis=1))
1 loops, best of 3: 2.1 s per loop

相关问题更多 >

编程相关推荐

热门问题

热门文章