Condition based on values in multiple columns

Posted 2024-04-28 22:54:18


I have a DataFrame like this (a text version is included in the edit further down):

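I want to add a column "Correct_entry" that returns True or False depending on the combination of values in the existing columns. There are 24 valid combinations in total. Here are two examples: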

If (df['Hazard_type'] == 'Drought') & (df['Card_type'] == 'Red block') & (df['Round'] == 1) & (df['Scenario'] == 'scenario A') & (df['Payment_type'] == 'One payment (lump sum)')
If (df['Hazard_type'] == 'Drought') & (df['Card_type'] == 'Green block') & (df['Round'] == 1) & (df['Scenario'] == 'scenario A') & (df['Payment_type'] == 'One payment (lump sum)')

All 24 of these combinations are correct and should yield True. Every other combination should yield False.

What is the best way to go through this dataset? How can I combine these different statements?

I hope this is clear.

Edit: As requested, here is the data in text format:

      Hazard_type  Card_type    Round  Scenario    Payment_type
244   Drought      Green block  2      scenario B  Two payments (two consecutive sums)
643   Drought      Red block    4      scenario A  Two payments (two consecutive sums)
584   Drought      Red block    4      scenario A  One payment (lump sum)
242   Drought      Red block    2      scenario B  Two payments (two consecutive sums)
1039  Drought      Green block  6      scenario A  Two payments (two consecutive sums)
101   Flood        Red block    1      scenario A  Two payments (two consecutive sums)

2 Answers

I would do this by creating a series of boolean masks based on the scenarios:

# boolean masks
drought_mask = df['Hazard_type'] == 'Drought'
red_block_mask = df['Card_type'] == 'Red block'
green_block_mask = df['Card_type'] == 'Green block'
round_1_mask = df['Round'] == 1
scenario_a_mask = df['Scenario'] == 'scenario A'
one_payment_mask = df['Payment_type'] == 'One payment (lump sum)'


# scenario 1: combine the masks directly (no df.loc, which would return rows instead of a mask)
scenario_1_mask = drought_mask & red_block_mask & round_1_mask & scenario_a_mask & one_payment_mask
# scenario 2
scenario_2_mask = drought_mask & green_block_mask & round_1_mask & scenario_a_mask & one_payment_mask

# combining scenarios: True where a row matches any valid scenario
df['Correct_entry'] = scenario_1_mask | scenario_2_mask

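You can create a mask for each scenario and then combine them with the OR operator (|) into the final Correct_entry column. This should return True for every row that matches one of the 24 scenarios and False for all other rows.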
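With 24 scenarios, writing one mask per scenario by hand gets repetitive. As a rough sketch of an alternative (not part of the answer above), the valid combinations could be kept in a lookup table and matched with a left merge. The `valid` table below lists only the two combinations given in the question, and the sample rows are made up for illustration.

```python
import pandas as pd

# Illustrative sample rows (assumed data, modeled on the question's columns)
df = pd.DataFrame({
    "Hazard_type": ["Drought", "Drought", "Flood", "Drought"],
    "Card_type": ["Red block", "Green block", "Red block", "Red block"],
    "Round": [1, 1, 1, 4],
    "Scenario": ["scenario A", "scenario A", "scenario A", "scenario A"],
    "Payment_type": ["One payment (lump sum)"] * 4,
})

# Lookup table of valid combinations; only the two from the question are
# listed here, the remaining 22 would be added as further rows
cols = ["Hazard_type", "Card_type", "Round", "Scenario", "Payment_type"]
valid = pd.DataFrame([
    ("Drought", "Red block", 1, "scenario A", "One payment (lump sum)"),
    ("Drought", "Green block", 1, "scenario A", "One payment (lump sum)"),
], columns=cols)

# A left merge keeps every row of df in its original order; indicator=True
# adds a "_merge" column that reads "both" where the row matched a valid combination
merged = df.merge(valid, on=cols, how="left", indicator=True)
df["Correct_entry"] = (merged["_merge"] == "both").to_numpy()
print(df)
```

This assumes the rows of `valid` are unique; duplicate rows in the lookup table would duplicate rows in the merge result and break the positional assignment.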

  • Generate data to match your DF
  • Use np.select() to return which condition matched, or np.nan if none did, for transparency
  • Then simply convert to bool
import numpy as np
import pandas as pd

s = 200  # number of rows of random sample data
df = pd.DataFrame({"Hazard_type":np.random.choice(["Drought","Flood"],s),
             "Card_type":np.random.choice(["Red block","Green block"],s),
             "Round":np.random.randint(1,7,s),
             "Scenario":np.random.choice(["scenario A","scenario B"],s),
             "Payment_type":np.random.choice(["One payment (lump sum)","Two payments"],s)})

conditions = [
    # condition 0
    (
        (df['Hazard_type'] == 'Drought') & (df['Card_type'] == 'Red block') & (df['Round'] == 1)
        & (df['Scenario'] == 'scenario A') & (df['Payment_type'] == 'One payment (lump sum)')
    ),
    # condition 1
    (
        (df['Hazard_type'] == 'Drought') & (df['Card_type'] == 'Green block') & (df['Round'] == 1)
        & (df['Scenario'] == 'scenario A') & (df['Payment_type'] == 'One payment (lump sum)')
    ),
]

# Correct_case: index of the matched condition, or NaN when no condition matches
# Correct_entry: True where some condition matched
df = df.assign(Correct_case=np.select(conditions, list(range(len(conditions))), np.nan),
               Correct_entry=lambda dfa: ~dfa.Correct_case.isna())

Sample output for rows where Correct_entry == True: [table omitted]
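To illustrate what np.select() returns in this approach, here is a small deterministic sketch. The three rows in `demo` are made up for illustration, and `common` is a helper name introduced here to factor out the part shared by both conditions; neither comes from the original answer.

```python
import numpy as np
import pandas as pd

# Three hand-made rows (assumed data): row 0 matches condition 0,
# row 1 matches condition 1, row 2 matches neither
demo = pd.DataFrame({
    "Hazard_type": ["Drought", "Drought", "Flood"],
    "Card_type": ["Red block", "Green block", "Red block"],
    "Round": [1, 1, 1],
    "Scenario": ["scenario A"] * 3,
    "Payment_type": ["One payment (lump sum)"] * 3,
})

# Part shared by both conditions
common = (
    (demo["Hazard_type"] == "Drought") & (demo["Round"] == 1)
    & (demo["Scenario"] == "scenario A")
    & (demo["Payment_type"] == "One payment (lump sum)")
)
conditions = [
    common & (demo["Card_type"] == "Red block"),    # condition 0
    common & (demo["Card_type"] == "Green block"),  # condition 1
]

# For each row, np.select picks the choice belonging to the first condition
# that is True, and falls back to the default (NaN) when none is
case = np.select(conditions, [0, 1], default=np.nan)
demo["Correct_case"] = case
demo["Correct_entry"] = ~np.isnan(case)
print(demo[["Hazard_type", "Card_type", "Correct_case", "Correct_entry"]])
```

Keeping Correct_case alongside Correct_entry shows which of the valid combinations a row matched, which can help when checking the 24 conditions for mistakes.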
