How do I group a pandas DataFrame by different dates?

Posted 2024-05-16 23:21:41


I am trying to aggregate daily data into fiscal-quarter data. For example, I have a table containing the fiscal quarter end dates:

Company Period Quarter_End
M       2016Q1 05/02/2015
M       2016Q2 08/01/2015
M       2016Q3 10/31/2015
M       2016Q4 01/30/2016
WFM     2015Q2 04/12/2015
WFM     2015Q3 07/05/2015 
WFM     2015Q4 09/27/2015
WFM     2016Q1 01/17/2016

and a table of daily data:

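Company Date       Price
M       2015-06-20 1.05
M       2015-06-22 4.05
M       2015-07-10 3.45
M       2015-07-29 1.86
M       2015-08-24 1.58
M       2015-09-02 8.64
M       2015-09-22 2.56
M       2015-10-20 5.42
M       2015-11-02 1.58
M       2015-11-24 4.58
M       2015-12-03 6.48
M       2015-12-05 4.56
M       2016-01-03 7.14
M       2016-01-30 6.34
WFM     2015-06-20 1.05
WFM     2015-06-22 4.05
WFM     2015-07-10 3.45
WFM     2015-07-29 1.86
WFM     2015-08-24 1.58
WFM     2015-09-02 8.64
WFM     2015-09-22 2.56
WFM     2015-10-20 5.42
WFM     2015-11-02 1.58
WFM     2015-11-24 4.58
WFM     2015-12-03 6.48
WFM     2015-12-05 4.56
WFM     2016-01-03 7.14
WFM     2016-01-17 6.34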

I would like to create the following table:

[table not preserved]

However, I don't know how to group by these differing dates without iterating over every record. Any help is greatly appreciated.

Thanks!


2 Answers

I think you can use merge_ordered:

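import pandas as pd

# first convert the date columns to datetime
# (df1 is the quarter-end table, df2 the daily table)
df1.Quarter_End = pd.to_datetime(df1.Quarter_End)
df2.Date = pd.to_datetime(df2.Date)

# outer merge that keeps the rows ordered by company and date
df = pd.merge_ordered(df1,
                      df2,
                      left_on=['Company', 'Quarter_End'],
                      right_on=['Company', 'Date'],
                      how='outer')
print(df)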
   Company  Period Quarter_End       Date  Price
0        M  2016Q1  2015-05-02        NaT    NaN
1        M     NaN         NaT 2015-06-20   1.05
2        M     NaN         NaT 2015-06-22   4.05
3        M     NaN         NaT 2015-07-10   3.45
4        M     NaN         NaT 2015-07-29   1.86
5        M  2016Q2  2015-08-01        NaT    NaN
6        M     NaN         NaT 2015-08-24   1.58
7        M     NaN         NaT 2015-09-02   8.64
8        M     NaN         NaT 2015-09-22   2.56
9        M     NaN         NaT 2015-10-20   5.42
10       M  2016Q3  2015-10-31        NaT    NaN
11       M     NaN         NaT 2015-11-02   1.58
12       M     NaN         NaT 2015-11-24   4.58
13       M     NaN         NaT 2015-12-03   6.48
14       M     NaN         NaT 2015-12-05   4.56
15       M     NaN         NaT 2016-01-03   7.14
16       M  2016Q4  2016-01-30 2016-01-30   6.34
17     WFM  2015Q2  2015-04-12        NaT    NaN
18     WFM     NaN         NaT 2015-06-20   1.05
19     WFM     NaN         NaT 2015-06-22   4.05
20     WFM  2015Q3  2015-07-05        NaT    NaN
21     WFM     NaN         NaT 2015-07-10   3.45
22     WFM     NaN         NaT 2015-07-29   1.86
23     WFM     NaN         NaT 2015-08-24   1.58
24     WFM     NaN         NaT 2015-09-02   8.64
25     WFM     NaN         NaT 2015-09-22   2.56
26     WFM  2015Q4  2015-09-27        NaT    NaN
27     WFM     NaN         NaT 2015-10-20   5.42
28     WFM     NaN         NaT 2015-11-02   1.58
29     WFM     NaN         NaT 2015-11-24   4.58
30     WFM     NaN         NaT 2015-12-03   6.48
31     WFM     NaN         NaT 2015-12-05   4.56
32     WFM     NaN         NaT 2016-01-03   7.14
33     WFM  2016Q1  2016-01-17 2016-01-17   6.34

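Then fill the NaN values in the Period and Quarter_End columns (each quarter-end row follows the daily rows that belong to it, so the fill runs backward) and aggregate. If you need to remove all NaN values, drop them as well.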
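The answer's own code for this step is not preserved, so the following is only a minimal sketch of one way the fill-and-aggregate step might look. It assumes a back-fill per company followed by a sum of Price. The sample frames are a small subset of the question's data, and the helper column `_key`, the explicit sort, and the choice of `dropna` and `reset_index` are assumptions made for illustration, not necessarily what the original answer used.

```python
import pandas as pd

# Small subset of the question's tables (values copied from the question).
df1 = pd.DataFrame({
    "Company": ["M", "M", "WFM", "WFM"],
    "Period": ["2016Q1", "2016Q2", "2015Q2", "2015Q3"],
    "Quarter_End": ["05/02/2015", "08/01/2015", "04/12/2015", "07/05/2015"],
})
df2 = pd.DataFrame({
    "Company": ["M", "M", "M", "M", "WFM", "WFM"],
    "Date": ["06/20/2015", "06/22/2015", "07/10/2015", "07/29/2015",
             "06/20/2015", "06/22/2015"],
    "Price": [1.05, 4.05, 3.45, 1.86, 1.05, 4.05],
})
df1["Quarter_End"] = pd.to_datetime(df1["Quarter_End"], format="%m/%d/%Y")
df2["Date"] = pd.to_datetime(df2["Date"], format="%m/%d/%Y")

df = pd.merge_ordered(df1, df2,
                      left_on=["Company", "Quarter_End"],
                      right_on=["Company", "Date"],
                      how="outer")

# Assumption: sort explicitly on a combined date key so the back-fill
# does not depend on the row order returned by the merge.
df["_key"] = df["Quarter_End"].fillna(df["Date"])
df = df.sort_values(["Company", "_key"])

# Back-fill within each company so a daily row picks up the next quarter end.
cols = ["Period", "Quarter_End"]
df[cols] = df.groupby("Company")[cols].bfill()

# Drop quarter-end marker rows (no Date) and daily rows that fall after
# the last known quarter end (no Period), then sum Price per quarter.
result = (df.dropna(subset=["Date", "Period"])
            .groupby(["Company", "Period", "Quarter_End"])["Price"]
            .sum()
            .reset_index())
print(result)
```

With this subset, M's four daily prices all fall into 2016Q2 (a total of about 10.41) and WFM's two daily prices fall into 2015Q3. Filling per company rather than over the whole column keeps one company's trailing rows from picking up another company's period.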

  • set_index on both tables
  • pd.concat to align the indexes
  • groupby with an aggregation

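# period_df is the quarter-end table and price_df the daily table;
# Quarter_End and Date are assumed to be datetime already
prd_df = period_df.set_index(['Company', 'Quarter_End'])
prc_df = price_df.set_index(['Company', 'Date'], drop=False)

# align on (Company, date); sort so each quarter end follows its daily rows
df = pd.concat([prd_df, prc_df], axis=1).sort_index()

# back-fill Period, then take the last date and the price sum per quarter
df.groupby([df.index.get_level_values(0), df.Period.bfill()]) \
  .agg(dict(Date='last', Price='sum')).dropna()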
