Change the date in one column based on the actual data in other columns

Posted 2024-06-16 16:52:54


I have the following DataFrame:

 account_id contract_id date_activated  2021-12-01 00:00:00 2021-01-01 00:00:00 2021-02-01 00:00:00 2021-03-01 00:00:00 2021-04-01 00:00:00 2021-05-01 00:00:00 2021-06-01 00:00:00
0   1   A   2020-12-04  200.0   200.0   200.0   0.0 0.0 0.0 0.0
1   1   B   2021-03-09  0.0 0.0 300.0   300.0   300.0   300.0   300.0
2   1   C   2021-04-25  0.0 0.0 0.0 0.0 100.0   100.0   100.0

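I want to change the date_activated column whenever its month and year do not match the first month that shows payment data in the monthly columns (the 4th column onward). The modified date_activated should equal that column's header, i.e. it becomes the earliest payment date, so a changed value is always the first day of a month. Rows where the month and year already match should be left as they are.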

I would like the output to look like this:

 account_id contract_id date_activated  2021-12-01 00:00:00 2021-01-01 00:00:00 2021-02-01 00:00:00 2021-03-01 00:00:00 2021-04-01 00:00:00 2021-05-01 00:00:00 2021-06-01 00:00:00
0   1   A   2021-12-01  200.0   200.0   200.0   0.0 0.0 0.0 0.0
1   1   B   2021-02-01  0.0 0.0 300.0   300.0   300.0   300.0   300.0
2   1   C   2021-04-25  0.0 0.0 0.0 0.0 100.0   100.0   100.0

Here is the DataFrame as a dictionary:

{'account_id': {0: 1, 1: 1, 2: 1},
 'contract_id': {0: 'A', 1: 'B', 2: 'C'},
 'date_activated': {0: Timestamp('2020-12-04 00:00:00'),
  1: Timestamp('2021-03-09 00:00:00'),
  2: Timestamp('2021-04-25 00:00:00')},
 datetime.datetime(2021, 12, 1, 0, 0): {0: 200.0, 1: 0.0, 2: 0.0},
 datetime.datetime(2021, 1, 1, 0, 0): {0: 200.0, 1: 0.0, 2: 0.0},
 datetime.datetime(2021, 2, 1, 0, 0): {0: 200.0, 1: 300.0, 2: 0.0},
 datetime.datetime(2021, 3, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 0.0},
 datetime.datetime(2021, 4, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0},
 datetime.datetime(2021, 5, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0},
 datetime.datetime(2021, 6, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0}}
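
The dictionary above looks like the output of df.to_dict(), so it cannot be pasted back without a few imports. A minimal sketch of rebuilding the frame, assuming the names datetime and Timestamp are imported as shown (the variable names data and df are my own):

```python
import datetime

import pandas as pd
from pandas import Timestamp

# Same content as the dictionary in the question, unchanged.
data = {
    'account_id': {0: 1, 1: 1, 2: 1},
    'contract_id': {0: 'A', 1: 'B', 2: 'C'},
    'date_activated': {0: Timestamp('2020-12-04 00:00:00'),
                       1: Timestamp('2021-03-09 00:00:00'),
                       2: Timestamp('2021-04-25 00:00:00')},
    datetime.datetime(2021, 12, 1, 0, 0): {0: 200.0, 1: 0.0, 2: 0.0},
    datetime.datetime(2021, 1, 1, 0, 0): {0: 200.0, 1: 0.0, 2: 0.0},
    datetime.datetime(2021, 2, 1, 0, 0): {0: 200.0, 1: 300.0, 2: 0.0},
    datetime.datetime(2021, 3, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 0.0},
    datetime.datetime(2021, 4, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0},
    datetime.datetime(2021, 5, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0},
    datetime.datetime(2021, 6, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0},
}

# Outer keys become columns, inner keys become the row index.
df = pd.DataFrame(data)
print(df.shape)
```

Note that the column labels are a mix of strings and datetime objects, which is why positional selection (iloc) is the convenient way to pick out the monthly columns.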

Here is the desired output as a dictionary:

{'account_id': {0: 1, 1: 1, 2: 1},
 'contract_id': {0: 'A', 1: 'B', 2: 'C'},
 'date_activated': {0: Timestamp('2021-12-01 00:00:00'),
  1: Timestamp('2021-02-01 00:00:00'),
  2: Timestamp('2021-04-25 00:00:00')},
 datetime.datetime(2021, 12, 1, 0, 0): {0: 200.0, 1: 0.0, 2: 0.0},
 datetime.datetime(2021, 1, 1, 0, 0): {0: 200.0, 1: 0.0, 2: 0.0},
 datetime.datetime(2021, 2, 1, 0, 0): {0: 200.0, 1: 300.0, 2: 0.0},
 datetime.datetime(2021, 3, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 0.0},
 datetime.datetime(2021, 4, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0},
 datetime.datetime(2021, 5, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0},
 datetime.datetime(2021, 6, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0}}

1 Answer
User
#1 · posted 2024-06-16 16:52:54

You can use pandas.Series.first_valid_index() to find, for each row, the label of the first non-null value among the datetime columns after replacing 0 with np.nan.
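
To see what that call does in isolation, here is a small sketch on a single, made-up row of payments (the names row, empty, first_paid and no_payment are illustrative only). "First" means first in column order, not earliest in calendar order, and a row with no payments yields None:

```python
import numpy as np
import pandas as pd

# Hypothetical single row of monthly payments, labelled by month start.
row = pd.Series(
    [0.0, 0.0, 300.0, 300.0],
    index=pd.to_datetime(["2021-12-01", "2021-01-01", "2021-02-01", "2021-03-01"]),
)

# Zeros become NaN, so the first "valid" label is the first non-zero payment.
first_paid = row.replace(0, np.nan).first_valid_index()
print(first_paid)

# A row that contains only zeros has no valid label at all.
empty = pd.Series([0.0, 0.0], index=pd.to_datetime(["2021-01-01", "2021-02-01"]))
no_payment = empty.replace(0, np.nan).first_valid_index()
print(no_payment)
```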

Then use pandas.Series.where() on the date_activated column to replace the values for which the condition is False:

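import numpy as np
import pandas as pd

# For each row: label of the first non-zero payment column (4th column onward)
idx = df.iloc[:, 3:].replace(0, np.nan).T.apply(pd.Series.first_valid_index)

# True where date_activated already has the same year and month as that label
m = (df['date_activated'].dt.year == idx.dt.year) & (df['date_activated'].dt.month == idx.dt.month)

# Keep the dates that match; replace the others with the first payment month
df['date_activated'] = df['date_activated'].where(m, idx)
print(df['date_activated'])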

0   2021-12-01
1   2021-02-01
2   2021-04-25
Name: date_activated, dtype: datetime64[ns]
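
For comparison, the same rule can also be sketched with a boolean mask and monthly periods. This is an alternative of my own, not the method from the answer: the function name align_activation, the standalone payments frame, and the fourth contract with no payments are all hypothetical. It also assumes that a contract without any payment should keep its original date; with the approach above such a row would, as far as I can tell, end up as NaT instead.

```python
import pandas as pd

# Monthly columns in the same order as in the question.
month_cols = pd.to_datetime(
    ["2021-12-01", "2021-01-01", "2021-02-01", "2021-03-01",
     "2021-04-01", "2021-05-01", "2021-06-01"]
)
payments = pd.DataFrame(
    [[200.0, 200.0, 200.0, 0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 300.0, 300.0, 300.0, 300.0, 300.0],
     [0.0, 0.0, 0.0, 0.0, 100.0, 100.0, 100.0],
     [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]],  # hypothetical contract, never paid
    columns=month_cols,
)
activated = pd.Series(
    pd.to_datetime(["2020-12-04", "2021-03-09", "2021-04-25", "2021-05-17"])
)


def align_activation(activated, payments):
    """Move each date to the first paid month unless month and year already match."""
    paid = payments.ne(0)
    # idxmax returns the first column label where `paid` is True ...
    first_paid = pd.to_datetime(paid.idxmax(axis=1))
    # ... but also the first column for all-False rows, so blank those out.
    first_paid = first_paid.where(paid.any(axis=1))
    # Comparing monthly periods checks year and month in one step.
    same_month = activated.dt.to_period("M") == first_paid.dt.to_period("M")
    # Assumption: rows without any payment keep their original date.
    keep = same_month | first_paid.isna()
    return activated.where(keep, first_paid)


result = align_activation(activated, payments)
print(result)
```

Comparing periods avoids writing the year and month checks separately, and the explicit handling of rows without payments makes that edge case a visible decision rather than a side effect.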
