Incrementing dates within a groupby

Posted 2024-04-19 21:04:42


I am building a basic rota/timetable for staff, and I have a DataFrame from a MySQL cursor that gives a list of ids, dates, and classes:

        id             the_date  class
0   195593  2017-09-12 14:00:00      3
1   193972  2017-09-13 09:15:00      2
2   195594  2017-09-13 14:00:00      3
3   195595  2017-09-15 14:00:00      3
4   193947  2017-09-16 17:30:00      3
5   195627  2017-09-17 08:00:00      2
6   193948  2017-09-19 11:30:00      2
7   195628  2017-09-21 08:00:00      2
8   193949  2017-09-21 11:30:00      2
9   195629  2017-09-24 08:00:00      2
10  193950  2017-09-24 10:00:00      2
11  193951  2017-09-27 11:30:00      2
12  195644  2017-09-28 06:00:00      1
13  194400  2017-09-28 08:00:00      1
14  195630  2017-09-28 08:00:00      2
15  193952  2017-09-29 11:30:00      2
16  195631  2017-10-01 08:00:00      2
17  194401  2017-10-06 08:00:00      1
18  195645  2017-10-06 10:00:00      1
19  195632  2017-10-07 13:30:00      3

If class == 1, I need to replicate that row 5 times.

first_class = df[df['class'] == 1]
non_first_class = df[df['class'] != 1]
first_class_replicated = pd.concat([first_class]*5, ignore_index=True).sort_values(['the_date'])

    id             the_date  class
0   195644  2017-09-28 06:00:00      1
16  195644  2017-09-28 06:00:00      1
4   195644  2017-09-28 06:00:00      1
12  195644  2017-09-28 06:00:00      1
8   195644  2017-09-28 06:00:00      1
17  194400  2017-09-28 08:00:00      1
13  194400  2017-09-28 08:00:00      1
9   194400  2017-09-28 08:00:00      1
5   194400  2017-09-28 08:00:00      1
1   194400  2017-09-28 08:00:00      1
6   194401  2017-10-06 08:00:00      1
18  194401  2017-10-06 08:00:00      1
10  194401  2017-10-06 08:00:00      1
14  194401  2017-10-06 08:00:00      1
2   194401  2017-10-06 08:00:00      1
11  195645  2017-10-06 10:00:00      1
3   195645  2017-10-06 10:00:00      1
15  195645  2017-10-06 10:00:00      1
7   195645  2017-10-06 10:00:00      1
19  195645  2017-10-06 10:00:00      1

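I then merge non_first_class and first_class_replicated. Before that, though, I need the dates in first_class_replicated to increase by one day per row within each id group. Is there an elegant solution, or should I look at iterating over the groupby series to modify the dates?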

Expected output:

id      
0   195644  2017-09-28 6:00:00
16  195644  2017-09-29 6:00:00
4   195644  2017-09-30 6:00:00
12  195644  2017-10-01 6:00:00
8   195644  2017-10-02 6:00:00
17  194400  2017-09-28 8:00:00
13  194400  2017-09-29 8:00:00
9   194400  2017-09-30 8:00:00
5   194400  2017-10-01 8:00:00
1   194400  2017-10-02 8:00:00
6   194401  2017-10-06 8:00:00
18  194401  2017-10-07 8:00:00
10  194401  2017-10-08 8:00:00
14  194401  2017-10-09 8:00:00
2   194401  2017-10-10 8:00:00
11  195645  2017-10-06 10:00:00
3   195645  2017-10-07 10:00:00
15  195645  2017-10-08 10:00:00
7   195645  2017-10-09 10:00:00
19  195645  2017-10-10 10:00:00

1 answer
User
#1 · Posted 2024-04-19 21:04:42

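You can use groupby().cumcount() to number the rows within each id, then convert that counter with pd.to_timedelta and add it to the date column: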

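import numpy as np
import pandas as pd

# alternative way to repeat each class-1 row 5 times
# (the parentheses let the method chain span two lines)
first_class_replicated = (first_class.loc[np.repeat(first_class.index, 5)]
                                     .sort_values(['the_date']))

# counter 0..4 within each id, converted to a day offset and added to the date
# (assumes the_date is already a datetime column)
df1 = first_class_replicated.groupby('id').cumcount()
first_class_replicated['the_date'] += pd.to_timedelta(df1, unit='D')
print(first_class_replicated)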
        id            the_date  class
0   195644 2017-09-28 06:00:00      1
16  195644 2017-09-29 06:00:00      1
4   195644 2017-09-30 06:00:00      1
12  195644 2017-10-01 06:00:00      1
8   195644 2017-10-02 06:00:00      1
17  194400 2017-09-28 08:00:00      1
13  194400 2017-09-29 08:00:00      1
9   194400 2017-09-30 08:00:00      1
5   194400 2017-10-01 08:00:00      1
1   194400 2017-10-02 08:00:00      1
6   194401 2017-10-06 08:00:00      1
18  194401 2017-10-07 08:00:00      1
10  194401 2017-10-08 08:00:00      1
14  194401 2017-10-09 08:00:00      1
2   194401 2017-10-10 08:00:00      1
11  195645 2017-10-06 10:00:00      1
3   195645 2017-10-07 10:00:00      1
15  195645 2017-10-08 10:00:00      1
7   195645 2017-10-09 10:00:00      1
19  195645 2017-10-10 10:00:00      1
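
For anyone who wants to run the idea end to end, here is a minimal sketch, assuming the_date is already a datetime column. The four sample rows are copied from the question's table. The names replicated, day_offset, and rota are made up for illustration, and Index.repeat is used here in place of np.repeat. The final concat is only one possible way to do the merge the question mentions.

```python
import pandas as pd

# Four rows taken from the question's table (two of them are class 1).
df = pd.DataFrame({
    "id": [195593, 195644, 194400, 195630],
    "the_date": pd.to_datetime([
        "2017-09-12 14:00:00",
        "2017-09-28 06:00:00",
        "2017-09-28 08:00:00",
        "2017-09-28 08:00:00",
    ]),
    "class": [3, 1, 1, 2],
})

first_class = df[df["class"] == 1]
non_first_class = df[df["class"] != 1]

# Repeat every class-1 row 5 times; copies of the same id stay adjacent.
# reset_index gives the copies a fresh, unique index.
replicated = first_class.loc[first_class.index.repeat(5)].reset_index(drop=True)

# cumcount numbers the rows 0..4 within each id; use that as a day offset.
day_offset = pd.to_timedelta(replicated.groupby("id").cumcount(), unit="D")
replicated["the_date"] = replicated["the_date"] + day_offset

# Recombine with the untouched rows and order chronologically (hypothetical final step).
rota = (pd.concat([non_first_class, replicated], ignore_index=True)
          .sort_values(["the_date", "id"])
          .reset_index(drop=True))
print(rota)
```

Resetting the index after the repeat is a design choice: it avoids duplicate index labels, so later arithmetic between columns and derived Series does not depend on how pandas aligns repeated labels.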
