Pandas: filter multiple columns with a single criterion

Published 2024-06-09 10:12:30


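I have an Excel sheet with more than a hundred columns. I need to filter on five of them to find the rows that have "No" in any of those cells. Is there a way to filter multiple columns with a single search criterion, for example: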

 no_invoice_filter = df[(df['M1: PL - INVOICED']) & (df['M2: EX - INVOICED']) & (df['M3: TEST DEP - INVOICED']) == 'No']

as opposed to writing out a separate comparison against "No" for each column?

The code above raises this error:

TypeError: unsupported operand type(s) for &: 'str' and 'bool'

2 answers

You can do:

df[(df[['M1: PL - INVOICED','M2: EX - INVOICED','M3: TEST DEP - INVOICED']] == 'No')]

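This passes a list of the columns of interest and compares only those columns against the scalar value. The result is a masked copy of the frame, not a row filter. If you are after the rows where "No" appears in any of those columns, add any(axis=1) to the comparison.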

In [115]:
df = pd.DataFrame({'a':'no', 'b':'yes', 'c':['yes','no','yes','no','no']})
df

Out[115]:
    a    b    c
0  no  yes  yes
1  no  yes   no
2  no  yes  yes
3  no  yes   no
4  no  yes   no

With any(axis=1), this returns all rows where "no" appears in any of the columns of interest:

In [133]:    
df[(df[['a','c']] == 'no').any(axis=1)]

Out[133]:
    a    b    c
0  no  yes  yes
1  no  yes   no
2  no  yes  yes
3  no  yes   no
4  no  yes   no
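
To connect this back to the question, here is a minimal sketch showing that the any(axis=1) mask gives the same result as writing out one comparison per column and joining them with |. The small frame and its column names x, y, z are made up for illustration and are not from the question's spreadsheet.

```python
import pandas as pd

# Hypothetical data, only for this illustration
df_demo = pd.DataFrame({
    'x': ['yes', 'no', 'yes', 'yes'],
    'y': ['yes', 'yes', 'yes', 'no'],
    'z': ['no', 'no', 'no', 'no'],   # not one of the columns being checked
})

# One comparison over a column subset, reduced row-wise
mask_any = (df_demo[['x', 'y']] == 'no').any(axis=1)

# The long form: each comparison needs its own parentheses,
# because & and | bind more tightly than == in Python
mask_long = (df_demo['x'] == 'no') | (df_demo['y'] == 'no')

print(mask_any.tolist() == mask_long.tolist())
print(df_demo[mask_any])
```

Both masks select rows 1 and 3 here. Column z is ignored because it is not in the subset, even though every cell in it is "no".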

You can also index with the mask itself and then use dropna to drop the rows that are NaN in a specific column:

In [132]:    
df[df[['a','c']] == 'no'].dropna(subset=['c'])

Out[132]:
    a    b   c
1  no  NaN  no
3  no  NaN  no
4  no  NaN  no
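
A short sketch of why column b turns into NaN in the output above, reusing the a/b/c frame from this answer. Indexing a DataFrame with a boolean DataFrame keeps the cells where the mask is True and blanks every other cell, including whole columns that are missing from the mask. The variable names mask, masked and kept are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': 'no', 'b': 'yes', 'c': ['yes', 'no', 'yes', 'no', 'no']})

# Boolean DataFrame covering only columns a and c
mask = df[['a', 'c']] == 'no'

# Cells where the mask is not True become NaN; b is not in the mask at all
masked = df[mask]
print(masked)

# Drop the rows where c did not match
kept = masked.dropna(subset=['c'])
print(kept.index.tolist())
```

If the original values of b are needed in the result, the any(axis=1) form is the safer choice, since it selects whole rows without changing them.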

You need to select the subset of columns and use any(axis=1) to check for at least one "No" per row:

df[(df[['M1: PL - INVOICED','M2: EX - INVOICED','M3: TEST DEP - INVOICED']] == 'No')
      .any(axis=1)]

Sample:

df = pd.DataFrame({'M1: PL - INVOICED':['a','Yes','No'],
                   'M2: EX - INVOICED':['Yes','No','b'],
                   'M3: TEST DEP - INVOICED':['s','a','No']})

print (df)
  M1: PL - INVOICED M2: EX - INVOICED M3: TEST DEP - INVOICED
0                 a               Yes                       s
1               Yes                No                       a
2                No                 b                      No

print ((df[['M1: PL - INVOICED','M2: EX - INVOICED','M3: TEST DEP - INVOICED']] == 'No'))
  M1: PL - INVOICED M2: EX - INVOICED M3: TEST DEP - INVOICED
0             False             False                   False
1             False              True                   False
2              True             False                    True

print ((df[['M1: PL - INVOICED','M2: EX - INVOICED','M3: TEST DEP - INVOICED']] == 'No')
          .any(axis=1))
0    False
1     True
2     True
dtype: bool


print (df[(df[['M1: PL - INVOICED','M2: EX - INVOICED','M3: TEST DEP - INVOICED']] == 'No')
           .any(axis=1)])

  M1: PL - INVOICED M2: EX - INVOICED M3: TEST DEP - INVOICED
1               Yes                No                       a
2                No                 b                      No
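
If the same check is needed for several groups of columns, the pattern can be wrapped in a small helper. This is only a sketch: the function name rows_with_value and its parameters are hypothetical, and it uses DataFrame.eq, which is the method form of ==.

```python
import pandas as pd

def rows_with_value(df, columns, value):
    """Return the rows of df where at least one of the given columns equals value.

    Hypothetical helper for illustration; not part of pandas.
    """
    mask = df[columns].eq(value).any(axis=1)
    return df[mask]

df = pd.DataFrame({'M1: PL - INVOICED': ['a', 'Yes', 'No'],
                   'M2: EX - INVOICED': ['Yes', 'No', 'b'],
                   'M3: TEST DEP - INVOICED': ['s', 'a', 'No']})

invoice_cols = ['M1: PL - INVOICED', 'M2: EX - INVOICED', 'M3: TEST DEP - INVOICED']
no_invoice = rows_with_value(df, invoice_cols, 'No')
print(no_invoice)
```

The comparison is exact and case-sensitive, so 'No' and 'no' are treated as different values.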
