Generating a column based on a constraint

Published 2024-05-23 22:46:59


F_Date      B_Date      col   is_B
01/09/2019  02/08/2019  2200    1
01/09/2019  03/08/2019  672     1
02/09/2019  03/08/2019  1828    1
01/09/2019  04/08/2019  503     0
02/09/2019  04/08/2019  829     1
03/09/2019  04/08/2019  1367    0
02/09/2019  05/08/2019  559     1
03/09/2019  05/08/2019  922     1
04/09/2019  05/08/2019  1519    0
01/09/2019  06/08/2019  376     1

I want to generate a column c_a such that, for the first entry of each flight date (F_Date), the value starts at 25000 and then decreases according to the col values. For example:

Expected output:

F_Date      B_Date      col   is_B   c_a
01/09/2019  02/08/2019  2200    1    25000
01/09/2019  03/08/2019  672     1    25000 - 2200
02/09/2019  03/08/2019  1828    1    25000
01/09/2019  04/08/2019  503     0    25000 - 2200 - 672
02/09/2019  04/08/2019  829     1    25000 - 1828
03/09/2019  04/08/2019  1367    0    25000
02/09/2019  05/08/2019  559     1    25000 - 1828 - 829
03/09/2019  05/08/2019  922     1    25000 (since last value had is_B as 0)
04/09/2019  05/08/2019  1519    0    25000
01/09/2019  06/08/2019  376     1    25000 - 2200 - 672 (Since last appearance had is_B as 0)

Can anyone figure out a way to achieve this?


3 Answers

I think I found a fairly concise solution:

df['c_a'] = df.groupby('F_Date').apply(lambda grp:
    25000 - grp.col.where(grp.is_B.eq(1), 0).shift(fill_value=0)
    .cumsum()).reset_index(level=0, drop=True)

The result is:

       F_Date      B_Date   col  is_B    c_a
0  01/09/2019  02/08/2019  2200     1  25000
1  01/09/2019  03/08/2019   672     1  22800
2  02/09/2019  03/08/2019  1828     1  25000
3  01/09/2019  04/08/2019   503     0  22128
4  02/09/2019  04/08/2019   829     1  23172
5  03/09/2019  04/08/2019  1367     0  25000
6  02/09/2019  05/08/2019   559     1  22343
7  03/09/2019  05/08/2019   922     1  25000
8  04/09/2019  05/08/2019  1519     0  25000
9  01/09/2019  06/08/2019   376     1  22128

The idea, using the group F_Date == '01/09/2019' as an example:

  1. grp.col.where(grp.is_B.eq(1), 0) - the values to subtract from the next row of the group:

    0    2200
    1     672
    3       0
    9     376
    
  2. .shift(fill_value=0) - the values to subtract from the current row of the group:

    0       0
    1    2200
    3     672
    9       0
    
  3. .cumsum() - the cumulative values to subtract:

    0       0
    1    2200
    3    2872
    9    2872
    
  4. 25000 - ... - the target values:

    0    25000
    1    22800
    3    22128
    9    22128
    
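The four steps above can be reproduced in isolation; a minimal sketch, with the rows of the F_Date == '01/09/2019' group taken from the question's data:

```python
import pandas as pd

# Rows of the F_Date == '01/09/2019' group (original indices 0, 1, 3, 9)
grp = pd.DataFrame({'col': [2200, 672, 503, 376],
                    'is_B': [1, 1, 0, 1]},
                   index=[0, 1, 3, 9])

step1 = grp.col.where(grp.is_B.eq(1), 0)  # 1. zero out rows with is_B == 0
step2 = step1.shift(fill_value=0)         # 2. value subtracted from the current row
step3 = step2.cumsum()                    # 3. cumulative amount to subtract
c_a = 25000 - step3                       # 4. target values

print(c_a.tolist())  # [25000, 22800, 22128, 22128]
```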

A nice bit of pandas play :)

import pandas as pd
df = pd.DataFrame({'F_Date': [pd.to_datetime(_, format='%d/%m/%Y') for _ in
                              ['01/09/2019', '01/09/2019', '02/09/2019', '01/09/2019', '02/09/2019',
                               '03/09/2019', '02/09/2019', '03/09/2019', '04/09/2019', '01/09/2019']],
                   'B_Date': [pd.to_datetime(_, format='%d/%m/%Y') for _ in
                              ['02/08/2019', '03/08/2019', '03/08/2019', '04/08/2019', '04/08/2019',
                               '04/08/2019', '05/08/2019', '05/08/2019','05/08/2019', '06/08/2019']],
                   'col': [2200, 672, 1828, 503, 829, 1367, 559, 922, 1519, 376],
                   'is_B': [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]
                   })

Let's walk through it step by step:

# sort in the order that fits the semantics of your calculations
df.sort_values(['F_Date', 'B_Date'], inplace=True)

# initialize 'c_a' to 25000 if a new F_Date starts
df.loc[df['F_Date'].diff(1) != pd.Timedelta(0), 'c_a'] = 25000

# Step down from every 25000 and subtract the shifted 'col'
# if the shifted 'is_B' == 1; otherwise carry the shifted 'c_a' over to the next line
while pd.isna(df.c_a).any():
    df.c_a.where(
        pd.notna(df.c_a),   # set every not-NaN value to ...
        df.c_a.shift(1).where(       # ...the previous / shifted c_a...
            df.is_B.shift(1) == 0,   # ... if previous / shifted is_B == 0
            df.c_a.shift(1) - df.col.shift(1)   # ... otherwise subtract the shifted 'col'
        ), inplace=True
    )

# restore original order
df.sort_index(inplace=True)

This is the result I get:

      F_Date     B_Date   col  is_B      c_a
0 2019-09-01 2019-08-02  2200     1  25000.0
1 2019-09-01 2019-08-03   672     1  22800.0
2 2019-09-02 2019-08-03  1828     1  25000.0
3 2019-09-01 2019-08-04   503     0  22128.0
4 2019-09-02 2019-08-04   829     1  23172.0
5 2019-09-03 2019-08-04  1367     0  25000.0
6 2019-09-02 2019-08-05   559     1  22343.0
7 2019-09-03 2019-08-05   922     1  25000.0
8 2019-09-04 2019-08-05  1519     0  25000.0
9 2019-09-01 2019-08-06   376     1  22128.0
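The recurrence the while-loop converges to can also be unrolled in plain Python; this is only an illustration of the logic for a single F_Date group (rows taken from the question's data), not part of the answer's code:

```python
# Recurrence of the loop above, unrolled for the F_Date == '01/09/2019' group
col = [2200, 672, 503, 376]
is_B = [1, 1, 0, 1]

c_a = [25000.0]                  # a new F_Date group always starts at 25000
for i in range(1, len(col)):
    if is_B[i - 1] == 0:         # previous row had is_B == 0: carry the value forward
        c_a.append(c_a[-1])
    else:                        # otherwise subtract the previous row's col
        c_a.append(c_a[-1] - col[i - 1])

print(c_a)  # [25000.0, 22800.0, 22128.0, 22128.0]
```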

Try groupby with shift, cumsum and ffill:

m = ~df.groupby('F_Date').is_B.diff().eq(1)
s = (-df.col).groupby(df.F_Date).apply(lambda x: x.shift(fill_value=25000).cumsum())

df['c_a'] = s.where(m).groupby(df.F_Date).ffill()


Out[98]:
       F_Date      B_Date   col  is_B      c_a
0  01/09/2019  02/08/2019  2200     1  25000.0
1  01/09/2019  03/08/2019   672     1  22800.0
2  02/09/2019  03/08/2019  1828     1  25000.0
3  01/09/2019  04/08/2019   503     0  22128.0
4  02/09/2019  04/08/2019   829     1  23172.0
5  03/09/2019  04/08/2019  1367     0  25000.0
6  02/09/2019  05/08/2019   559     1  22343.0
7  03/09/2019  05/08/2019   922     1  25000.0
8  04/09/2019  05/08/2019  1519     0  25000.0
9  01/09/2019  06/08/2019   376     1  22128.0
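To see what the mask m and the running sum s contribute, here is a sketch restricted to a single F_Date group (data from the question; the group-by bookkeeping of the full solution is left out):

```python
import pandas as pd

# One F_Date group from the question (original row indices 0, 1, 3, 9)
g = pd.DataFrame({'col': [2200, 672, 503, 376],
                  'is_B': [1, 1, 0, 1]},
                 index=[0, 1, 3, 9])

m = ~g.is_B.diff().eq(1)                       # False right after an is_B 0 -> 1 transition
s = (-g.col).shift(fill_value=25000).cumsum()  # running balance seeded with 25000

c_a = s.where(m).ffill()                       # mask out the reset rows, then carry forward
print(c_a.tolist())  # [25000.0, 22800.0, 22128.0, 22128.0]
```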
