Sum numpy array values based on labels in a separate array

Posted 2024-06-09 19:32:00


I have an array of labels similar to the following:

a=[["tennis","tennis","golf","federer","cricket"],
   ["federer","nadal","woods","sausage","federer"],
   ["sausage","lion","prawn","prawn","sausage"]]

I also have a matrix of weights with the same shape:

^{pr2}$

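For each row, I want to sum the weights per label (using the labels in a) and then take the 3 labels with the highest totals. In the end I want something like this: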

res=[["cricket","tennis","federer"],
     ["federer","sausage","nadal"],
     ["lion","sausage","prawn"]]

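In my real dataset ties are extremely unlikely and not really a concern. However, a row may contain fewer than 3 distinct labels. Say an entire row is: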

["federer","federer","federer","federer","federer"]

Ideally I would like that row returned as ["federer","",""].

Any guidance would be appreciated.


3 answers

I managed to get it working with the following code:

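import numpy as np

def myf(a, w):
    # Encode each label as an integer, then sum the weights per label
    lookupTable, indexed_dataSet = np.unique(a, return_inverse=True)
    y = np.bincount(indexed_dataSet, w)
    # Labels ordered by descending total weight, top 3 only
    res = lookupTable[y.argsort()[::-1][:3]]
    # If the row has fewer than 3 distinct labels, pad by repeating the last one
    ret = np.full(3, res[-1], dtype=lookupTable.dtype)
    ret[0:res.shape[0]] = res
    return ret

# a and w are the label and weight arrays from the question
a = np.asarray(a)
w = np.asarray(w)
result = np.empty_like(a[:, 0:3])
for i, (x, y) in enumerate(zip(a, w)):
    result[i] = myf(x, y)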
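To make the two numpy calls in the answer above easier to follow, here is a minimal sketch on a single row. The weights are made up for illustration (the question's weight matrix is not reproduced here), so the numbers are assumptions, not the asker's data.

```python
import numpy as np

# Hypothetical weights for one row of labels; not taken from the question
row = np.array(["tennis", "tennis", "golf", "federer", "cricket"])
weights = np.array([0.25, 0.25, 0.3, 0.4, 0.8])

# labels: sorted unique labels; codes: position in `labels` of each element of `row`
labels, codes = np.unique(row, return_inverse=True)

# totals[k] is the sum of the weights of every element whose label is labels[k]
totals = np.bincount(codes, weights=weights)

# Indices of the three largest totals, largest first
top3 = labels[np.argsort(totals)[::-1][:3]]

print(dict(zip(labels.tolist(), totals.tolist())))
# {'cricket': 0.8, 'federer': 0.4, 'golf': 0.3, 'tennis': 0.5}
print(top3.tolist())
# ['cricket', 'tennis', 'federer']
```

The two "tennis" weights are added together before ranking, which is why "tennis" ends up ahead of "federer" even though each individual "tennis" weight is smaller.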

Try:

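import pandas as pd
# a and w must be DataFrames of the same shape (labels and weights)
a, w = pd.DataFrame(a), pd.DataFrame(w)
print(pd.DataFrame(
    {i: a.loc[i, row.sort_values(ascending=False).index[:3]].values for i, row in w.iterrows()}
).T)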

         0        1      2
0  cricket  federer   golf
1  federer  sausage  nadal
2     lion  sausage  prawn

For numpy arrays, see piRSquared's answer.

Here is a pure Python approach:

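# a and w are lists of lists with the same shape (labels and weights)
for i in range(len(a)):
    if a[i].count(a[i][0]) == len(a[i]):
        # Every label in the row is identical: return it once, padded with ""
        res = [a[i][0], "", ""]
    else:
        # Rank labels by their individual weights (duplicates are not summed)
        res = [x[0] for x in sorted(zip(a[i], w[i]), key=lambda c: c[1], reverse=True)[:3]]

    print(res)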
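Since the question asks for the weights to be summed per label before ranking, a pure Python variant that does the summing might look like the sketch below. The helper name `top_labels` and the weights are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def top_labels(labels, weights, k=3):
    """Hypothetical helper: sum weights per label, return the k heaviest labels."""
    totals = defaultdict(float)
    for label, weight in zip(labels, weights):
        totals[label] += weight
    # Labels sorted by descending total weight, truncated to k
    ranked = sorted(totals, key=totals.get, reverse=True)[:k]
    # Pad with empty strings when the row has fewer than k distinct labels
    return ranked + [""] * (k - len(ranked))

# Made-up weights, not the question's data
print(top_labels(["tennis", "tennis", "golf", "federer", "cricket"],
                 [0.25, 0.25, 0.3, 0.4, 0.8]))
# ['cricket', 'tennis', 'federer']
print(top_labels(["federer"] * 5, [0.1, 0.2, 0.3, 0.4, 0.5]))
# ['federer', '', '']
```

Padding falls out of the same code path, so the all-identical row does not need a special case.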
