I have data like this and want to create a column named 'Month':
+---------+------------------+------+------+
| Name | Task | Team | Date |
+---------+------------------+------+------+
| John | Market study | A | 1 |
+---------+------------------+------+------+
| Michael | Customer service | B | 1 |
+---------+------------------+------+------+
| Joanna | Accounting | C | 1 |
+---------+------------------+------+------+
| John | Accounting | B | 2 |
+---------+------------------+------+------+
| Michael | Customer service | A | 2 |
+---------+------------------+------+------+
| Joanna | Market study | C | 2 |
+---------+------------------+------+------+
| John | Customer service | C | 1 |
+---------+------------------+------+------+
| Michael | Market study | A | 1 |
+---------+------------------+------+------+
| Joanna | Customer service | B | 1 |
+---------+------------------+------+------+
| John | Market study | A | 2 |
+---------+------------------+------+------+
| Michael | Customer service | B | 2 |
+---------+------------------+------+------+
| Joanna | Accounting | C | 2 |
+---------+------------------+------+------+
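So basically, I have the date information, but the date does not include the month it belongs to. However, I know that the first time a date appears it belongs to month 1, and the second time it appears it belongs to month 2. For example, date 1 appears 3 times and is then interrupted by date 2. So the first 3 occurrences belong to month 1, and the next 3 occurrences belong to month 2. I would like my result to look like this: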
+---------+------------------+------+------+---------+
| Name | Task | Team | Date | Month |
+---------+------------------+------+------+---------+
| John | Market study | A | 1 | Month 1 |
+---------+------------------+------+------+---------+
| Michael | Customer service | B | 1 | Month 1 |
+---------+------------------+------+------+---------+
| Joanna | Accounting | C | 1 | Month 1 |
+---------+------------------+------+------+---------+
| John | Accounting | B | 2 | Month 1 |
+---------+------------------+------+------+---------+
| Michael | Customer service | A | 2 | Month 1 |
+---------+------------------+------+------+---------+
| Joanna | Market study | C | 2 | Month 1 |
+---------+------------------+------+------+---------+
| John | Customer service | C | 1 | Month 2 |
+---------+------------------+------+------+---------+
| Michael | Market study | A | 1 | Month 2 |
+---------+------------------+------+------+---------+
| Joanna | Customer service | B | 1 | Month 2 |
+---------+------------------+------+------+---------+
| John | Market study | A | 2 | Month 2 |
+---------+------------------+------+------+---------+
| Michael | Customer service | B | 2 | Month 2 |
+---------+------------------+------+------+---------+
| Joanna | Accounting | C | 2 | Month 2 |
+---------+------------------+------+------+---------+
I can't think of a way to do this other than using some loops. Thanks, everyone.
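If I understand the question correctly, you can do the following: create a mask `s` that separates each run of consecutive equal `Date` values into its own group. From `s`, create a mask `s1` that numbers each value within its group. Then group by `s1` and `Date`, and apply `cumcount` and `map` to create the desired output.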