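Get row and column index pairs of a Pandas DataFrame that match a condition
Suppose I have a Pandas DataFrame that looks like the one below. The values come from a distance matrix.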
import numpy as np
import pandas as pd

A = pd.DataFrame([(1.0,0.8,0.6708203932499369,0.6761234037828132,0.7302967433402214),
(0.8,1.0,0.6708203932499369,0.8451542547285166,0.9128709291752769),
(0.6708203932499369,0.6708203932499369,1.0,0.5669467095138409,0.6123724356957946),
(0.6761234037828132,0.8451542547285166,0.5669467095138409,1.0,0.9258200997725514),
(0.7302967433402214,0.9128709291752769,0.6123724356957946,0.9258200997725514,1.0)
])
Output:
Out[65]:
0 1 2 3 4
0 1.000000 0.800000 0.670820 0.676123 0.730297
1 0.800000 1.000000 0.670820 0.845154 0.912871
2 0.670820 0.670820 1.000000 0.566947 0.612372
3 0.676123 0.845154 0.566947 1.000000 0.925820
4 0.730297 0.912871 0.612372 0.925820 1.000000
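I only want the upper triangular part (excluding the diagonal), so I set the lower triangle and the diagonal to NaN: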
c2 = A.copy()
c2.values[np.tril_indices_from(c2)] = np.nan
Output:
Out[67]:
0 1 2 3 4
0 NaN 0.8 0.67082 0.676123 0.730297
1 NaN NaN 0.67082 0.845154 0.912871
2 NaN NaN NaN 0.566947 0.612372
3 NaN NaN NaN NaN 0.925820
4 NaN NaN NaN NaN NaN
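As a side note, writing through `.values` may not work on pandas versions where copy-on-write is enabled. A minimal sketch of another way to get the same upper-triangle frame, assuming the same 5x5 matrix as above (the names `mask` and `upper` are only illustrative, not part of the original question):

```python
import numpy as np
import pandas as pd

A = pd.DataFrame([(1.0, 0.8, 0.6708203932499369, 0.6761234037828132, 0.7302967433402214),
                  (0.8, 1.0, 0.6708203932499369, 0.8451542547285166, 0.9128709291752769),
                  (0.6708203932499369, 0.6708203932499369, 1.0, 0.5669467095138409, 0.6123724356957946),
                  (0.6761234037828132, 0.8451542547285166, 0.5669467095138409, 1.0, 0.9258200997725514),
                  (0.7302967433402214, 0.9128709291752769, 0.6123724356957946, 0.9258200997725514, 1.0)])

# Boolean mask that is True strictly above the diagonal (k=1 skips the diagonal itself).
mask = np.triu(np.ones(A.shape, dtype=bool), k=1)

# Keep values where the mask is True; every other cell becomes NaN.
# This returns a new DataFrame and leaves A unchanged.
upper = A.where(mask)
print(upper)
```

This avoids the explicit copy and the in-place assignment, since `where` builds a new frame.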
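Now I want to get the row and column index pairs that satisfy some condition. For example: get the row and column indices where the value is greater than 0.8. For this condition, the output should be [1,3],[1,4],[3,4]. Any help would be appreciated.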
1 Answer
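You can use numpy's argwhere function, which returns the integer positions of the elements where the condition is True (comparisons against NaN are False, so the masked cells are skipped):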
In [11]: np.argwhere(c2 > 0.8)
Out[11]:
array([[1, 3],
[1, 4],
[3, 4]])
If you want the index or column labels (rather than their integer positions), you can use a list comprehension:
[(c2.index[i], c2.columns[j]) for i, j in np.argwhere(c2 > 0.8)]
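For anyone who wants to run this end to end, here is a self-contained sketch that puts the pieces together. It assumes the matrix from the question and passes a plain NumPy array (via `to_numpy()`) to `np.argwhere`, since how NumPy handles a DataFrame passed directly can depend on the installed versions. The `stack`-based variant at the end is an alternative I am adding for comparison, not something from the answer above, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

A = pd.DataFrame([(1.0, 0.8, 0.6708203932499369, 0.6761234037828132, 0.7302967433402214),
                  (0.8, 1.0, 0.6708203932499369, 0.8451542547285166, 0.9128709291752769),
                  (0.6708203932499369, 0.6708203932499369, 1.0, 0.5669467095138409, 0.6123724356957946),
                  (0.6761234037828132, 0.8451542547285166, 0.5669467095138409, 1.0, 0.9258200997725514),
                  (0.7302967433402214, 0.9128709291752769, 0.6123724356957946, 0.9258200997725514, 1.0)])

# Upper triangle only: cells on or below the diagonal become NaN.
c2 = A.where(np.triu(np.ones(A.shape, dtype=bool), k=1))

# Boolean array of matches; NaN > 0.8 evaluates to False, so masked cells drop out.
hits = c2.to_numpy() > 0.8

# Integer (row, column) positions of the matches, in row-major order.
positions = np.argwhere(hits)

# Translate positions into index/column labels.
pairs = [(c2.index[i], c2.columns[j]) for i, j in positions]
print(pairs)

# Alternative: stack into a Series keyed by (row label, column label), then filter.
stacked = c2.stack()
label_pairs = stacked[stacked > 0.8].index.tolist()
print(label_pairs)
```

Both approaches should give the three pairs asked for in the question. The `stack` version works directly with labels, which may be more convenient when the index and columns are not the default integers.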