有没有一种方法可以将pandas.isin()函数用于多个列表?

2024-05-13 08:38:20 发布

您现在位置:Python中文网/ 问答频道 /正文

    classes money    mask
student         
6   psp   -26.23    adj|psp  # keep
6   tnt   0.00      adj|psp
6   nvm   -87.42    adj|psp
6   alw   0.00      adj|psp
6   tnt   -14.87    adj|psp
6   adj   9.55      adj|psp  # keep
6   psp   -18.56    adj|psp  # keep
6   wzi   -17.41    adj|psp
6   nvm   65.97     adj|psp
6   psp   27.41     adj|psp  # keep
89  alw   -180.33   alw      # keep
89  alw   -869.44   alw      # keep
89  tnt   -95.66    alw
89  wzi   -35.43    alw
89  nvm   -144.90   alw
89  alw   180.23    alw      # keep
89  tnt   35.43     alw
105 tnt   -24.01    tnt      # keep
105 wzi   213.13    tnt
105 ins   -26.06    tnt
105 tnt   60.81     tnt      # keep

我想过滤这个数据框,这样我就只有与掩码对应的每个帐户的行了。例如,对于student 6,我只想要类为“adj”或“psp”的行

我试着去做。isin(),但我使用的列表不断变化,所以有没有关于如何最有效地完成这项工作的想法

最后,我希望得到这些类的总和,并标记大于零的类。如果可能的话,我想把下面的循环变成更有效的循环。谢谢

bad_list = []

for i in test.index.unique():
    df_ = test.loc[i]
    mask_class = df_["mask"].str.split("|")
    mask_class = mask_class.iloc[0]
    df_mask = df_[df_['classes'].isin(mask_class)]
    if(sum(df_mask["money"])>0):
        bad_list.append(i)
    ```

Tags: dfmaskstudentclassclassesbadkeepmoney
2条回答

如果您的情况因学生而异,您可以在字典中列出所有细节并向groupby申请

condition = {6:['adj','psp'],
         89:['alw'],
        105:['tnt']}

df1.groupby('student').apply(lambda x: x.loc[x['classes'].isin(condition[x['student'].iloc[0]])])

输出:

        student classes money
student             
6   0   6   psp -26.23
    5   6   adj 9.55
    6   6   psp -18.56
    9   6   psp 27.41
89  10  89  alw -180.33
    11  89  alw -869.44
    15  89  alw 180.23
105 17  105 tnt -24.01
    20  105 tnt 60.81

您可以创建一个临时列,该列包含掩码列splitexplode,并仅筛选与临时列的类匹配的行:

(df
  #create temporary column
 .assign(mask_expand = lambda x: x['mask'].str.split('|'))
  # 'explode' column into individual elements per row
 .explode('mask_expand')
  #keep only rows where the values in classes are in mask_expand(the temporary column)
 .query('classes == mask_expand')
  #get rid of the temporary column
 .drop('mask_expand', axis = 1)
 )

    student classes money   mask
 0     6      psp   -26.23  adj|psp
 5     6      adj   9.55    adj|psp
 6     6      psp   -18.56  adj|psp
 9     6      psp   27.41   adj|psp
 10    89     alw   -180.33 alw
 11    89     alw   -869.44 alw
 15    89     alw   180.23  alw
 17    105    tnt   -24.01  tnt
 20    105    tnt   60.81   tnt

相关问题 更多 >