pandas DataFrame resample: how can the aggregate function use a custom function over multiple columns?

Posted 2024-04-26 02:51:19


Here is an example:

import numpy as np
import pandas as pd

# Generate a random time-series DataFrame with 'price' and 'vol' columns
x = pd.date_range('2017-01-01', periods=100, freq='1min')
df_x = pd.DataFrame({'price': np.random.randint(50, 100, size=x.shape), 'vol': np.random.randint(1000, 2000, size=x.shape)}, index=x)
df_x.head(10)
                     price   vol
2017-01-01 00:00:00     56  1544
2017-01-01 00:01:00     70  1680
2017-01-01 00:02:00     92  1853
2017-01-01 00:03:00     94  1039
2017-01-01 00:04:00     81  1180
2017-01-01 00:05:00     70  1443
2017-01-01 00:06:00     56  1621
2017-01-01 00:07:00     68  1093
2017-01-01 00:08:00     59  1684
2017-01-01 00:09:00     86  1591

# An example aggregation: mean price and total vol per 5-minute bin
df_x.resample('5Min').agg({'price': 'mean', 'vol': 'sum'}).head()
                     price   vol
2017-01-01 00:00:00   78.6  7296
2017-01-01 00:05:00   67.8  7432
2017-01-01 00:10:00   76.0  9017
2017-01-01 00:15:00   74.0  6989
2017-01-01 00:20:00   64.4  8078

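But what can I do if I want to extract other aggregate information that depends on more than one column?

For example, I want to append two more columns here, named all_up and all_down.

These two columns are defined as follows:

Within each 5-minute window, count how many times the 1-minute sampled price and vol both went down, and call this column all_down; count how many times they both went up, and call this column all_up.

Here is what I expect for these two columns: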

                     price   vol  all_up  all_down
2017-01-01 00:00:00   78.6  7296       2         0
2017-01-01 00:05:00   67.8  7432       0         0
2017-01-01 00:10:00   76.0  9017       1         0
2017-01-01 00:15:00   74.0  6989       1         1
2017-01-01 00:20:00   64.4  8078       0         2
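
To make the definition concrete, here is a minimal sketch that checks it against the ten rows printed above. It assumes that "went up" means a positive difference from the previous 1-minute row and that both columns must move in the same direction for a step to count. The names delta, flags and counts are illustrative only. Note that diff() here runs over the whole frame, so the first row of each bin is compared with the last row of the previous bin; whether that is intended is an assumption.

```python
import pandas as pd

# The ten rows shown by df_x.head(10) above, hard-coded so the result is reproducible.
idx = pd.date_range('2017-01-01', periods=10, freq='1min')
df = pd.DataFrame({
    'price': [56, 70, 92, 94, 81, 70, 56, 68, 59, 86],
    'vol': [1544, 1680, 1853, 1039, 1180, 1443, 1621, 1093, 1684, 1591],
}, index=idx)

# Row-to-row change of each column; the very first row has no predecessor (NaN).
delta = df[['price', 'vol']].diff()

# A step counts only if BOTH columns moved in the same direction.
# Comparisons against NaN are False, so the first row is never counted.
flags = pd.DataFrame({
    'all_up': (delta['price'] > 0) & (delta['vol'] > 0),
    'all_down': (delta['price'] < 0) & (delta['vol'] < 0),
})

# Summing booleans per 5-minute bin gives the number of qualifying steps.
counts = flags.resample('5min').sum()
print(counts)
```

For these ten rows the sketch gives all_up = 2 and all_down = 0 for the 00:00 bin, and 0 and 0 for the 00:05 bin, which agrees with the first two rows of the expected table.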

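This computation depends on two columns. But the agg function of the Resampler object seems to accept only three kinds of argument:

A str or a function, applied to each column separately.
A list of functions, applied to each column separately.
A dict whose keys match column names. Each function is still applied to only a single column at a time.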
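
The three forms can be illustrated with a small sketch on the ten rows printed earlier. The variable names (by_name, by_list, by_dict, by_func) are made up for the illustration. The last call shows the limitation: the function given for 'price' only ever receives the price Series of one bin, so it cannot look at 'vol'.

```python
import pandas as pd

# The ten rows shown by df_x.head(10), hard-coded for reproducibility.
idx = pd.date_range('2017-01-01', periods=10, freq='1min')
df = pd.DataFrame({
    'price': [56, 70, 92, 94, 81, 70, 56, 68, 59, 86],
    'vol': [1544, 1680, 1853, 1039, 1180, 1443, 1621, 1093, 1684, 1591],
}, index=idx)
r = df.resample('5min')

# 1. A single str (or function): applied to every column separately.
by_name = r.agg('mean')

# 2. A list: every function is applied to every column,
#    producing (column, function) MultiIndex columns.
by_list = r.agg(['min', 'max'])

# 3. A dict keyed by column name: one function per column.
by_dict = r.agg({'price': 'mean', 'vol': 'sum'})

# A custom function in the dict still sees only ONE column per bin:
# here s is the 'price' Series of a bin, with no access to 'vol'.
by_func = r.agg({'price': lambda s: s.iloc[-1] - s.iloc[0]})

print(by_dict)
```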

None of these seems to meet my needs.
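
One possible way around this, offered as a sketch rather than a definitive answer, is to group on the time index with pd.Grouper and use apply, which hands the function the whole sub-DataFrame of each bin instead of a single column. The function name summarize is hypothetical. This version assumes changes are measured inside each bin only, so the first row of every bin has no predecessor and is never counted; if changes across bin boundaries should count, the differences would have to be computed before grouping.

```python
import pandas as pd

# The ten rows shown by df_x.head(10), hard-coded for reproducibility.
idx = pd.date_range('2017-01-01', periods=10, freq='1min')
df = pd.DataFrame({
    'price': [56, 70, 92, 94, 81, 70, 56, 68, 59, 86],
    'vol': [1544, 1680, 1853, 1039, 1180, 1443, 1621, 1093, 1684, 1591],
}, index=idx)


def summarize(g):
    """Summarize one 5-minute bin; g is the full sub-DataFrame of that bin."""
    d = g[['price', 'vol']].diff()  # within-bin changes; first row is NaN
    return pd.Series({
        'price': g['price'].mean(),
        'vol': g['vol'].sum(),
        # Count steps where both columns moved in the same direction.
        'all_up': ((d['price'] > 0) & (d['vol'] > 0)).sum(),
        'all_down': ((d['price'] < 0) & (d['vol'] < 0)).sum(),
    })


# Group the index into 5-minute bins and build one output row per bin.
out = df.groupby(pd.Grouper(freq='5min')).apply(summarize)

# The returned Series mixes floats and ints, so every column comes back as
# float; cast the count-like columns back to integers.
out = out.astype({'vol': int, 'all_up': int, 'all_down': int})
print(out)
```

On the ten sample rows this produces price 78.6, vol 7296, all_up 2, all_down 0 for the first bin and 67.8, 7432, 0, 0 for the second. Calling a Python function once per bin is slower than the built-in aggregations, which may matter for long series.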

