Aggregating daily data into a weekly dataframe

Posted 2024-05-15 21:44:15


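I am working with a weekly dataframe, and I want to summarize daily data into it.

I have two dataframes:

  • df: contains start_date, the date on which each week starts, and woy, the week number of the year. (weekly data)
  • df_school_vac: the dates of French school vacations. (daily data)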

What I want is the result below, with False and True as plain values, not wrapped in double quotes. [Image: expected output]

df:

{'start_date': {0: Timestamp('2018-11-05 00:00:00'),
  1: Timestamp('2018-11-12 00:00:00'),
  2: Timestamp('2018-11-19 00:00:00'),
  3: Timestamp('2018-11-26 00:00:00'),
  4: Timestamp('2018-12-03 00:00:00'),
  5: Timestamp('2018-12-10 00:00:00'),
  6: Timestamp('2018-12-17 00:00:00'),
  7: Timestamp('2018-12-24 00:00:00'),
  8: Timestamp('2018-12-31 00:00:00'),
  9: Timestamp('2019-01-07 00:00:00'),
  10: Timestamp('2019-01-14 00:00:00'),
  11: Timestamp('2019-01-21 00:00:00'),
  12: Timestamp('2019-01-28 00:00:00')},
 'woy': {0: 45,
  1: 46,
  2: 47,
  3: 48,
  4: 49,
  5: 50,
  6: 51,
  7: 52,
  8: 1,
  9: 2,
  10: 3,
  11: 4,
  12: 5}}

df_school_vac:

{'timestamp_area_A': {0: Timestamp('2018-12-22 00:00:00'),
  1: Timestamp('2018-12-23 00:00:00'),
  2: Timestamp('2018-12-24 00:00:00'),
  3: Timestamp('2018-12-25 00:00:00'),
  4: Timestamp('2018-12-26 00:00:00'),
  5: Timestamp('2018-12-27 00:00:00'),
  6: Timestamp('2018-12-28 00:00:00'),
  7: Timestamp('2018-12-29 00:00:00'),
  8: Timestamp('2018-12-30 00:00:00'),
  9: Timestamp('2018-12-31 00:00:00'),
  10: Timestamp('2019-01-01 00:00:00'),
  11: Timestamp('2019-01-02 00:00:00'),
  12: Timestamp('2019-01-03 00:00:00'),
  13: Timestamp('2019-01-04 00:00:00'),
  14: Timestamp('2019-01-05 00:00:00'),
  15: Timestamp('2019-01-06 00:00:00')},
 'vacation_name': {0: 'Vacances de Noël',
  1: 'Vacances de Noël',
  2: 'Vacances de Noël',
  3: 'Vacances de Noël',
  4: 'Vacances de Noël',
  5: 'Vacances de Noël',
  6: 'Vacances de Noël',
  7: 'Vacances de Noël',
  8: 'Vacances de Noël',
  9: 'Vacances de Noël',
  10: 'Vacances de Noël',
  11: 'Vacances de Noël',
  12: 'Vacances de Noël',
  13: 'Vacances de Noël',
  14: 'Vacances de Noël',
  15: 'Vacances de Noël'},
 'woy': {0: 51,
  1: 51,
  2: 52,
  3: 52,
  4: 52,
  5: 52,
  6: 52,
  7: 52,
  8: 52,
  9: 1,
  10: 1,
  11: 1,
  12: 1,
  13: 1,
  14: 1,
  15: 1}}
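
For anyone who wants to reproduce the question, the two dictionaries above can be rebuilt compactly. This is only a sketch: it assumes the woy column is the ISO week number (which matches the values listed) and that pandas 1.1 or later is available for `dt.isocalendar()`.

```python
import pandas as pd

# Weekly frame: 13 Mondays starting 2018-11-05, plus the ISO week number.
df = pd.DataFrame({'start_date': pd.date_range('2018-11-05', periods=13, freq='W-MON')})
df['woy'] = df['start_date'].dt.isocalendar().week.astype(int)

# Daily frame: every day of the vacation, 2018-12-22 through 2019-01-06.
df_school_vac = pd.DataFrame({
    'timestamp_area_A': pd.date_range('2018-12-22', '2019-01-06', freq='D'),
    'vacation_name': 'Vacances de Noël',
})
df_school_vac['woy'] = df_school_vac['timestamp_area_A'].dt.isocalendar().week.astype(int)
```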

1 answer
User
Answer 1 · Posted 2024-05-15 21:44:15

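Consider counting the vacation days in df_school_vac per week, using weekly bins anchored on Monday, then run a left join against the week-level df with merge: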

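import pandas as pd

# Count vacation days per weekly bin. With freq='W-MON' and the default
# closed/label settings, each bin ends on a Monday and is labelled by it.
agg_df = (df_school_vac.groupby(['vacation_name', 
                                 pd.Grouper(key='timestamp_area_A', freq='W-MON')])
                       .count()
                       .reset_index()
                       # set_axis returns a new frame; the inplace argument was removed in pandas 2.0
                       .set_axis(['holiday_school_name', 'start_date', 'holiday_school_count'], 
                                 axis='columns')
         )

# The left join keeps every week of df; weeks without vacation days get NaN,
# so notna() yields a real boolean flag rather than a string.
final_df = (pd.merge(df, agg_df, how='left', on=['start_date'])
              .assign(holiday_school = lambda x: x['holiday_school_name'].notna())
           )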

print(final_df)

#    start_date  woy holiday_school_name  holiday_school_count  holiday_school
# 0  2018-11-05   45                 NaN                   NaN           False
# 1  2018-11-12   46                 NaN                   NaN           False
# 2  2018-11-19   47                 NaN                   NaN           False
# 3  2018-11-26   48                 NaN                   NaN           False
# 4  2018-12-03   49                 NaN                   NaN           False
# 5  2018-12-10   50                 NaN                   NaN           False
# 6  2018-12-17   51                 NaN                   NaN           False
# 7  2018-12-24   52    Vacances de Noël                   3.0            True
# 8  2018-12-31    1    Vacances de Noël                   7.0            True
# 9  2019-01-07    2    Vacances de Noël                   6.0            True
# 10 2019-01-14    3                 NaN                   NaN           False
# 11 2019-01-21    4                 NaN                   NaN           False
# 12 2019-01-28    5                 NaN                   NaN           False
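
Note that in the output above each count is attached to the Monday that ends a bin, so the week of 2019-01-07 is flagged even though the vacation ended on 2019-01-06. If start_date is meant to be the first day of the week, one possible variation (a different technique from the answer's Grouper call, not the answer's own method) is to map every vacation day to the Monday of its own week and count per Monday. The small frames below are a hypothetical cut-down version of the question's data.

```python
import pandas as pd

# Hypothetical miniature of the question's data: four weeks around the vacation.
df = pd.DataFrame({'start_date': pd.date_range('2018-12-17', periods=4, freq='W-MON')})
df_school_vac = pd.DataFrame({
    'timestamp_area_A': pd.date_range('2018-12-22', '2019-01-06', freq='D'),
    'vacation_name': 'Vacances de Noël',
})

# Monday of the week each vacation day falls in (weekday: Monday == 0).
days = df_school_vac['timestamp_area_A']
week_start = days.dt.normalize() - pd.to_timedelta(days.dt.weekday, unit='D')

# Number of vacation days per week-starting Monday.
counts = week_start.value_counts()

# Look up each weekly start_date; weeks with no vacation days become 0.
weekly = df.assign(
    holiday_school_count=df['start_date'].map(counts).fillna(0).astype(int),
)
weekly['holiday_school'] = weekly['holiday_school_count'] > 0

print(weekly)
```

With this convention the weeks starting 2018-12-17, 2018-12-24 and 2018-12-31 are flagged, and the week starting 2019-01-07 is not.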
