Pandas: iterating within groups and inserting a conditional column in a complex case

Posted 2024-04-19 20:54:39


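I have a fairly complex problem: adding a new column whose value depends on a condition evaluated within each group. Here is a sample DataFrame: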

df = pd.DataFrame({
    'id': ['AA', 'AA', 'AA', 'AA', 'BB', 'BB', 'BB', 'BB', 'BB',
           'CC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC'],
    'From_num': [80, 68, 751, 'Issued', 32, 68, 126, 'Issued', 'Missed', 105, 68, 114, 76, 68, 99, 'Missed'],
    'To_num':[99, 80, 68, 751, 105, 32, 68, 126, 49, 324, 105, 68, 114, 76, 68, 99],
})
    id From_num  To_num
0   AA       80      99
1   AA       68      80
2   AA      751      68
3   AA   Issued     751
4   BB       32     105
5   BB       68      32
6   BB      126      68
7   BB   Issued     126
8   BB   Missed      49
9   CC      105     324
10  CC       68     105
11  CC      114      68
12  CC       76     114
13  CC       68      76
14  CC       99      68
15  CC   Missed      99

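The flag number is 68. Within each group, every row at or above the row where this flag appears in the 'From_num' column should be labelled "Forward" in a new column, and every row at or below the row where it appears in the 'To_num' column should be labelled "Back" in the same column. The hardest case is when the flag appears more than once in each column: the rows between the 'From_num' and 'To_num' occurrences should then be labelled "Forward&Back". See the expected result below.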

Expected result
    id From_num  To_num     Direction
0   AA       80      99       Forward
1   AA       68      80       Forward
2   AA      751      68          Back
3   AA   Issued     751          Back
4   BB       32     105       Forward
5   BB       68      32       Forward
6   BB      126      68          Back
7   BB   Issued     126          Back
8   BB   Missed      49          Back
9   CC      105     324       Forward
10  CC       68     105       Forward 
11  CC      114      68  Forward&Back # From line 11 to 13, flag # 68 appears more than once
12  CC       76     114  Forward&Back # so the line 11, 12 and 13 labelled "Forward&Back"
13  CC       68      76  Forward&Back 
14  CC       99      68          Back 
15  CC   Missed      99          Back
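
To make the rule concrete, it can help to list where the flag actually occurs in each group. The snippet below is only an illustrative sketch, not part of the original question; the names FLAG, hits and flag_rows are introduced here for the example.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['AA', 'AA', 'AA', 'AA', 'BB', 'BB', 'BB', 'BB', 'BB',
           'CC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC'],
    'From_num': [80, 68, 751, 'Issued', 32, 68, 126, 'Issued', 'Missed',
                 105, 68, 114, 76, 68, 99, 'Missed'],
    'To_num': [99, 80, 68, 751, 105, 32, 68, 126, 49,
               324, 105, 68, 114, 76, 68, 99],
})

FLAG = 68  # the flag number from the question (name is hypothetical)

# Rows where the flag shows up in either column. From_num mixes ints and
# strings (object dtype), so == compares element by element and the
# strings simply compare unequal.
hits = df[(df["From_num"] == FLAG) | (df["To_num"] == FLAG)]

# Row labels of the flag rows, collected per id.
flag_rows = {gid: g.index.tolist() for gid, g in hits.groupby("id")}
print(flag_rows)
```

For this data, AA and BB each have two flag rows, while CC has four (rows 10, 11, 13 and 14), which is why only CC produces "Forward&Back" rows.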

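I have tried writing many loops, but they all failed to produce the expected result. If anyone has an idea, please help. I hope the question is clear. Many thanks.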


1 answer
User
#1 · Posted 2024-04-19 20:54:39

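My approach has no "real" loops:

  1. Keep the row number (reset_index())
  2. Build a new DataFrame that holds only the records containing the flag (68)
  3. The simple "Forward" / "Back" logic depends on whether a row comes before or after the first row where 68 is seen
  4. "Forward&Back" occurs when there are more than two such observations, for the rows between the 2nd and the (n-1)th observation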
import pandas as pd

df = pd.DataFrame({
    'id': ['AA', 'AA', 'AA', 'AA', 'BB', 'BB', 'BB', 'BB', 'BB',
           'CC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC'],
    'From_num': [80, 68, 751, 'Issued', 32, 68, 126, 'Issued', 'Missed', 105, 68, 114, 76, 68, 99, 'Missed'],
    'To_num': [99, 80, 68, 751, 105, 32, 68, 126, 49, 324, 105, 68, 114, 76, 68, 99],
})
df = df.reset_index()  # keep the row number in an "index" column
# records where the flag (68) appears in either column
df2 = df[(df.From_num == 68) | (df.To_num == 68)].copy()

def direction(r):
    # row numbers of the flag records in this row's group
    # (assumes every id has at least one flag record)
    flagrow = df2[df2["id"] == r["id"]]["index"].values
    # at or before the first flag row: "Forward"; after it: "Back"
    val = "Forward" if r["index"] <= flagrow[0] else "Back"
    # more than two flag rows: from the 2nd flag row up to, but not including, the last
    if len(flagrow) > 2 and flagrow[1] <= r["index"] < flagrow[-1]:
        val = "Forward&Back"
    return val

df["Direction"] = df.apply(direction, axis=1)
df
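
The row-wise apply above filters df2 again for every row, which may become slow on large frames. One possible vectorised sketch of the same rule follows. It is not the answerer's code: the names FLAG, pos, is_flag, first, second, last and n_flags are introduced here, and it assumes the rows are already in the intended order within each id. Groups that contain no flag row at all end up labelled "Back", which is an arbitrary choice made for this sketch.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'id': ['AA', 'AA', 'AA', 'AA', 'BB', 'BB', 'BB', 'BB', 'BB',
           'CC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC'],
    'From_num': [80, 68, 751, 'Issued', 32, 68, 126, 'Issued', 'Missed',
                 105, 68, 114, 76, 68, 99, 'Missed'],
    'To_num': [99, 80, 68, 751, 105, 32, 68, 126, 49,
               324, 105, 68, 114, 76, 68, 99],
})

FLAG = 68  # hypothetical name for the flag number

# Positional row number of every row, and a mask of the rows holding the flag.
pos = pd.Series(np.arange(len(df)), index=df.index)
is_flag = (df["From_num"] == FLAG) | (df["To_num"] == FLAG)

# Flag positions (NaN elsewhere), then per-id first/last flag position
# broadcast back to every row of the group.
flag_pos = pos.where(is_flag)
by_id = flag_pos.groupby(df["id"])
first = by_id.transform("min")
last = by_id.transform("max")

# Second flag position: the smallest flag position after the first one.
second = flag_pos.where(flag_pos > first).groupby(df["id"]).transform("min")

# Number of flag rows per id.
n_flags = is_flag.groupby(df["id"]).transform("sum")

# "Forward" up to and including the first flag row, "Back" after it.
base = np.where(pos <= first, "Forward", "Back")

# With more than two flag rows, rows from the 2nd flag row up to (but not
# including) the last one become "Forward&Back".
both = (n_flags > 2) & (pos >= second) & (pos < last)
df["Direction"] = np.where(both, "Forward&Back", base)
print(df)
```

On the sample data this gives the same labels as the expected result in the question; whether it matches the intended rule on other data depends on the ordering assumption stated above.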
