pandas forward fill and backward fill based on a condition in another column
My DataFrame looks like this:
import pandas as pd

data = {
    "ReportName": ["Sample cycle", 'Message', "ID", "m1", "Uncertainty m1", "Message", "Sample cycle", 'Message', "ID", "m0", "Uncertainty m0", "Message", "ID", "m1", "Uncertainty m1", "Message"],
    "Values": ["1", "NO", "II", None, None, "NO", "1", "NO", "ID1", "1.8", "0.43", "NO", "ID2", "1.5", "0.41", "NO"],
}
df = pd.DataFrame(data)
I created a new "ID" column with this function:
def extract_id(row):
    if row['ReportName'] == 'ID':
        return row['Values']
    return None
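For context, here is a minimal sketch of how a row function like this is typically applied, next to a vectorized equivalent. The `df.apply(..., axis=1)` call is an assumption (the post does not show how the function was called), and the small frame and the names `id_by_apply` and `id_by_where` are only for illustration.

```python
import pandas as pd

# Shortened frame in the same shape as the data above (illustrative only).
df = pd.DataFrame({
    "ReportName": ["Sample cycle", "Message", "ID", "m1"],
    "Values": ["1", "NO", "II", None],
})

def extract_id(row):
    if row["ReportName"] == "ID":
        return row["Values"]
    return None

# Assumed usage: call the function once per row (axis=1).
id_by_apply = df.apply(extract_id, axis=1)

# Vectorized equivalent: keep 'Values' only where ReportName is 'ID',
# and leave a missing value everywhere else.
id_by_where = df["Values"].where(df["ReportName"].eq("ID"))
```

Both produce a value only on the "ID" rows; every other row is missing, which is what the fill step then has to deal with.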
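Now I want to fill the NaN values in "ID" within each block that starts at a row where "ReportName" equals 'Sample cycle' and runs until the next 'Sample cycle', leaving the 'Sample cycle' rows themselves empty.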
The desired result is:
        ReportName  Values    ID
0     Sample cycle       1  None
1          Message      NO    II
2               ID      II    II
3               m1    None    II
4   Uncertainty m1    None    II
5          Message      NO    II
6     Sample cycle       1  None
7          Message      NO   ID1
8               ID     ID1   ID1
9               m0     1.8   ID1
10  Uncertainty m0    0.43   ID1
11         Message      NO   ID1
12              ID     ID2   ID2
13              m1     1.5   ID2
14  Uncertainty m1    0.41   ID2
15         Message      NO   ID2
1 Answer
You can use groupby.transform combined with boolean masks:
# identify rows with ID
m1 = df['ReportName'].eq('ID')
# identify rows with "Sample cycle"
# this is used both to form groups
# and to mask the output
m2 = df['ReportName'].eq('Sample cycle')
df.loc[~m2, 'ID'] = (df['Values'].where(m1).groupby(m2.cumsum())
                     .transform(lambda x: x.ffill().bfill())
                    )
Output:
        ReportName  Values   ID
0     Sample cycle       1  NaN
1          Message      NO   II
2               ID      II   II
3               m1    None   II
4   Uncertainty m1    None   II
5          Message      NO   II
6     Sample cycle       1  NaN
7          Message      NO  ID1
8               ID     ID1  ID1
9               m0     1.8  ID1
10  Uncertainty m0    0.43  ID1
11         Message      NO  ID1
12              ID     ID2  ID2
13              m1     1.5  ID2
14  Uncertainty m1    0.41  ID2
15         Message      NO  ID2
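To make the intermediate steps visible, here is a sketch that splits the one-liner into named pieces and replaces the lambda with the built-in `groupby(...).ffill()` and `groupby(...).bfill()`. It is a variation on the approach above, not the exact code from the answer; the names `is_id`, `is_cycle`, `block`, `ids` and `filled` are chosen for illustration.

```python
import pandas as pd

data = {
    "ReportName": ["Sample cycle", "Message", "ID", "m1", "Uncertainty m1", "Message",
                   "Sample cycle", "Message", "ID", "m0", "Uncertainty m0", "Message",
                   "ID", "m1", "Uncertainty m1", "Message"],
    "Values": ["1", "NO", "II", None, None, "NO", "1", "NO", "ID1", "1.8", "0.43",
               "NO", "ID2", "1.5", "0.41", "NO"],
}
df = pd.DataFrame(data)

is_id = df["ReportName"].eq("ID")
is_cycle = df["ReportName"].eq("Sample cycle")

# Running count of 'Sample cycle' rows: all rows of one block share a number.
block = is_cycle.cumsum()

# 'Values' kept only on the ID rows, missing everywhere else.
ids = df["Values"].where(is_id)

# Forward fill inside each block, then backward fill the rows that sit
# before the first ID of their block. Neither fill crosses a block boundary.
filled = ids.groupby(block).ffill()
filled = filled.groupby(block).bfill()

# Blank out the 'Sample cycle' rows themselves.
df["ID"] = filled.mask(is_cycle)
```

Because the backward fill only runs after the forward fill, a block with several IDs (such as the second one, with ID1 and ID2) keeps each ID for the rows that follow it, and only the rows ahead of the block's first ID are filled backwards.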