Filter True values from a Pandas DataFrame and return (Row, Col) tuples

1 vote

2 answers

3192 views

Asked 2025-04-18 02:34

Suppose I have a DataFrame like this:

       a      b      c
1   True  False  False
2   True   True  False
3  False   True   True

I would like to get a list like this:

[(1,a), (2,a), (2,b), (3,b), (3,c)]

In other words, I want to keep only the cells whose value is True and get a (row label, column label) tuple for each of them.


2 Answers

In [27]: df
Out[27]: 
       a      b      c
1   True  False  False
2   True   True  False
3  False   True   True

[3 rows x 3 columns]

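You can use np.where to find the positions of the True values. It returns a tuple of two arrays: the row positions first, then the column positions: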

In [28]: np.where(df)
Out[28]: (array([0, 1, 1, 2, 2]), array([0, 0, 1, 1, 2]))

In [29]: x, y = np.where(df)

x and y are NumPy integer arrays holding the row and column positions respectively, so you can use integer indexing to select the labels:

In [30]: df.index[x]
Out[30]: Int64Index([1, 2, 2, 3, 3], dtype='int64')

In [31]: df.columns[y]
Out[31]: Index([u'a', u'a', u'b', u'b', u'c'], dtype='object')

Then use zip to combine them:

In [32]: zip(df.index[x], df.columns[y])
Out[32]: [(1, 'a'), (2, 'a'), (2, 'b'), (3, 'b'), (3, 'c')]
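
For readers on Python 3 and a recent pandas, here is a minimal self-contained sketch of the same np.where idea. The helper name true_cells is made up for illustration and is not part of pandas or NumPy; the frame is rebuilt from the question so the snippet runs on its own. Note that zip returns an iterator in Python 3, so it is wrapped in list.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (index labels 1..3, columns a..c).
df = pd.DataFrame(
    {
        "a": [True, True, False],
        "b": [False, True, True],
        "c": [False, False, True],
    },
    index=[1, 2, 3],
)


def true_cells(frame):
    """Return (row label, column label) pairs for every True cell.

    Hypothetical helper for illustration only.
    """
    # np.where on a 2-D boolean array yields (row positions, column positions).
    rows, cols = np.where(frame.to_numpy())
    # Map positions back to labels, then pair them up.
    return list(zip(frame.index[rows].tolist(), frame.columns[cols].tolist()))


pairs = true_cells(df)
print(pairs)
```

The pairs come out in row-major order because np.where scans the underlying array row by row.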

Answered 2025-04-18 by Python大师


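Another approach is to use stack (the frame in this example has the default 0-based index, so the row labels are 0, 1, 2):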

>>> s = df.stack()
>>> s[s].index.tolist()
[(0L, 'a'), (1L, 'a'), (1L, 'b'), (2L, 'b'), (2L, 'c')]

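This works because stack returns a flattened version of the frame, a Series indexed by (row, column) pairs, which can then be filtered by its own boolean values: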

>>> df.stack()
0  a     True
   b    False
   c    False
1  a     True
   b     True
   c    False
2  a    False
   b     True
   c     True
dtype: object
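
The stack approach can be sketched the same way on Python 3. This is only an illustration under assumptions: the frame is rebuilt with the question's 1-based index rather than the default index shown above, and the variable names are arbitrary. On Python 3 the labels print without the L suffix seen in the Python 2 output.

```python
import pandas as pd

# Rebuild the frame from the question (index labels 1..3, columns a..c).
df = pd.DataFrame(
    {
        "a": [True, True, False],
        "b": [False, True, True],
        "c": [False, False, True],
    },
    index=[1, 2, 3],
)

# stack() pivots the columns into an inner index level, giving a boolean
# Series whose index entries are (row label, column label) pairs.
s = df.stack()

# Boolean-index the Series with itself to keep only the True entries,
# then read the surviving index entries back as a list of tuples.
stacked_pairs = s[s].index.tolist()
print(stacked_pairs)
```

Compared with np.where, this version works on labels directly, so no separate position-to-label lookup is needed.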

Answered 2025-04-18 by Python大师
