I have a column 'dateTime', and I'm trying to achieve the following (it doesn't have to use groupby):
df['time_of_day_10'] = df['dateTime'].dt.floor('10min')
df['time_of_day_30'] = df['dateTime'].dt.floor('30min')
The problem is that after I aggregate the data like this:
groups = df.groupby(groupbytime,as_index=True)
df_grouped = (groups.agg({
'clients1': [np.mean,np.max,],
'clients2': [np.mean,np.max,],
}))
I lose my dateTime column, so I tried to add it back with the following:
groups = df.groupby(groupbytime,as_index=True)
df_grouped = (groups.agg({
'dateTime':['first'],
'clients1': [np.mean,np.max,],
'clients2': [np.mean,np.max,],
}))
This gives me:
dateTime first datetime64[ns]
I'm trying to get the rounded time and the date as separate columns in the groupby result. Thanks!
Adding sample data. Original data:
dateTime Clients1 Clients2
8 2017-10-23 08:00:04.854309 12991.5 2
10 2017-10-23 08:00:04.875162 12991.5 1
11 2017-10-23 08:00:04.875162 12991.5 1
12 2017-10-23 08:00:04.875162 12991.5 1
13 2017-10-23 08:00:04.875162 12991.5 1
23 2017-10-23 08:00:04.876464 12989.5 1
24 2017-10-23 08:00:04.876464 12989.5 1
32 2017-10-23 08:00:04.964356 12990 1
34 2017-10-23 08:00:04.968549 12990.5 1
38 2017-10-23 08:00:05.008758 12990 1
43 2017-10-23 08:00:05.996090 12990 2
45 2017-10-23 08:00:06.018212 12990 1
51 2017-10-23 08:00:06.344568 12989.5 1
56 2017-10-23 08:00:06.903661 12990 1
60 2017-10-23 08:00:07.120324 12990 1
66 2017-10-23 08:00:07.206179 12990.5 1
74 2017-10-23 08:00:07.358889 12991.5 3
77 2017-10-23 08:00:07.491244 12991 1
80 2017-10-23 08:00:07.671106 12991 1
83 2017-10-23 08:00:07.897968 12991 1
87 2017-10-23 08:00:08.028444 12991 1
95 2017-10-23 08:00:09.787827 12991.5 3
98 2017-10-23 08:00:10.178936 12991.5 3
104 2017-10-23 08:00:10.505921 12991.5 2
110 2017-10-23 08:00:11.438628 12992 1
112 2017-10-23 08:00:12.145907 12992 1
The result is:
dateTime Clients1 Clients1 Clients2 Clients2
first mean amax mean amax
1min
2017-10-23 08:00:00 2017-10-23 08:00:04.854309 12988.8902439024 12993.5 227 12987.7398373984
2017-10-23 08:01:00 2017-10-23 08:01:00.005942 12986.92 12988.5 84 12986.28
2017-10-23 08:02:00 2017-10-23 08:02:00.901496 12987.6486486486 12988.5 98 12987
2017-10-23 08:03:00 2017-10-23 08:03:00.521976 12986.8148148148 12987.5 65 12986.1296296296
2017-10-23 08:04:00 2017-10-23 08:04:02.800922 12986.4705882353 12986.5 47 12985.5294117647
2017-10-23 08:05:00 2017-10-23 08:05:00.670865 12985.3658536585 12986 88 12984.7804878049
2017-10-23 08:06:00 2017-10-23 08:06:00.141393 12987.359375 12988 103 12986.734375
2017-10-23 08:07:00 2017-10-23 08:07:00.922107 12987.5454545455 12988 34 12986.7727272727
2017-10-23 08:08:00 2017-10-23 08:08:00.165103 12986.8214285714 12988 46 12986.0714285714
2017-10-23 08:09:00 2017-10-23 08:09:01.910121 12988.96875 12990 145 12988.328125
2017-10-23 08:10:00 2017-10-23 08:10:00.008064 12988.2678571429 12989.5 102 12987.6785714286
2017-10-23 08:11:00 2017-10-23 08:11:05.533862 12989.4318181818 12991 71 12988.8636363636
2017-10-23 08:12:00 2017-10-23 08:12:01.124564 12991.0444444444 12992.5 144 12990.4444444444
2017-10-23 08:13:00 2017-10-23 08:13:00.347987 12992.84375 12995 185 12992.0390625
2017-10-23 08:14:00 2017-10-23 08:14:00.627402 12994.2906976744 12996 216 12993.6395348837
2017-10-23 08:15:00 2017-10-23 08:15:00.032132 12994.8859649123 12996.5 211 12994.298245614
One possible solution is to floor the timestamps and then aggregate with agg.
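The floor-then-agg idea can be sketched roughly as follows. This is only an illustration: the small sample frame, the `bucket` variable, and the output column names (`first_dateTime`, `clients1_mean`, and so on) are assumptions made up for the example, not part of the original post.

```python
import pandas as pd

# Hypothetical sample frame modeled on the question's data (values are illustrative).
df = pd.DataFrame({
    "dateTime": pd.to_datetime([
        "2017-10-23 08:00:04.854309",
        "2017-10-23 08:00:06.018212",
        "2017-10-23 08:12:01.124564",
        "2017-10-23 08:13:00.347987",
    ]),
    "Clients1": [12991.5, 12990.0, 12991.0, 12993.0],
    "Clients2": [2, 1, 1, 3],
})

# Floor each timestamp to its 10-minute bucket and group on that Series.
bucket = df["dateTime"].dt.floor("10min").rename("time_of_day_10")

df_grouped = df.groupby(bucket).agg(
    first_dateTime=("dateTime", "first"),   # keep the first raw timestamp per bucket
    clients1_mean=("Clients1", "mean"),
    clients1_max=("Clients1", "max"),
    clients2_mean=("Clients2", "mean"),
    clients2_max=("Clients2", "max"),
).reset_index()  # turn the floored bucket back into an ordinary column

print(df_grouped)
```

Grouping on the floored Series and calling `reset_index()` keeps the rounded time as a regular column next to the aggregates, and named aggregation avoids the two-level column header shown in the question.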
Edit: if you need the max date per group, use a custom function with date. Or, if you need the max datetime per group, floor by days.
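Both variants from the edit might look something like the sketch below. The helper names `max_date` and `max_day`, the sample frame, and the output column names are hypothetical and chosen only for illustration; the original answer's exact code is not shown here.

```python
import pandas as pd

# Hypothetical sample frame modeled on the question's data (values are illustrative).
df = pd.DataFrame({
    "dateTime": pd.to_datetime([
        "2017-10-23 08:00:04.854309",
        "2017-10-23 08:00:06.018212",
        "2017-10-23 08:12:01.124564",
        "2017-10-23 08:13:00.347987",
    ]),
    "Clients1": [12991.5, 12990.0, 12991.0, 12993.0],
})

def max_date(s):
    # Custom function using .dt.date: latest calendar date in the group
    # (typically returned as a datetime.date object).
    return s.dt.date.max()

def max_day(s):
    # Alternative: floor to whole days and take the max, which stays a timestamp.
    return s.dt.floor("D").max()

# Group on the 10-minute bucket, as in the question.
bucket = df["dateTime"].dt.floor("10min").rename("time_of_day_10")

df_dates = df.groupby(bucket).agg(
    date_only=("dateTime", max_date),
    day_floor=("dateTime", max_day),
    clients1_mean=("Clients1", "mean"),
).reset_index()

print(df_dates)
```

The first helper gives a plain date, which is convenient for display; the second keeps a datetime value at midnight, which is usually easier to use in further time-based operations.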