Summing word scores for a column that contains lists of words

> print(df['words'])
> 0    [awww, thats, bummer, shoulda, got, david, car...
> 1    [upset, that, he, cant, update, his, facebook,...
> 2    [dived, many, time, ball, managed, save, rest,...
> 3       [whole, body, feel, itchy, like, it, on, fire]
> 4    [no, it, not, behaving, at, all, im, mad, why,...
> 5                                  [not, whole, crew]

> print(sentiment)
>        abandon  -2
> 0    abandoned  -2
> 1     abandons  -2
> 2     abducted  -2
> 3    abduction  -2
> 4   abductions  -2
> 5        abhor  -3
> 6     abhorred  -3
> 7    abhorrent  -3
> 8       abhors  -3
> 9    abilities   2
> ...

2 answers

User

#1 · Edited 2024-05-23 20:19:10

If you have the scores as a Series, with the words as the index labels:

In [11]: s  # e.g. sentiment.set_index("word")["score"]
Out[11]:
abandon     -2
abandoned   -2
abandons    -2
abducted    -2
abduction   -2
Name: score, dtype: int64
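
For reference, a minimal sketch of how such a Series might be built. The column names "word" and "score" are assumptions taken from the comment in the transcript above; the question does not show the real column names.

```python
import pandas as pd

# Hypothetical lexicon frame; the column names "word" and "score" are assumed.
sentiment = pd.DataFrame({
    "word": ["abandon", "abandoned", "abandons", "abducted", "abduction"],
    "score": [-2, -2, -2, -2, -2],
})

# Index by word so that a score can be looked up by its label.
s = sentiment.set_index("word")["score"]
print(s)
```

In the printed `sentiment` frame from the question, the first entry (abandon, -2) appears to have become the column labels. If that is the case, reading the file with `pd.read_csv(..., header=None, names=["word", "score"])` would likely be needed before this step.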

Then you can look up the scores for a list of words and sum them:

In [12]: s.loc[["abandon", "abducted"]].sum()
Out[12]: -4

So the apply would be:

df['words'].apply(lambda ls: s.loc[ls].sum())

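If you need to support missing words (words not in s), you can use reindex, which returns NaN for them, and sum() skips NaN by default: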

In [21]: s.reindex(["abandon", "abducted", "missing_word"]).sum()
Out[21]: -4.0

df['words'].apply(lambda ls: s.reindex(ls).sum())
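
Putting the pieces together, an end-to-end sketch might look like the following. The sample data and the new column name "score" are made up for illustration; only `reindex` and `apply` come from the answer itself.

```python
import pandas as pd

# Small stand-in for the sentiment lexicon (word -> score).
s = pd.Series({"abandon": -2, "abducted": -2, "abhor": -3, "abilities": 2},
              name="score")

# Stand-in for the question's frame: one list of words per row.
df = pd.DataFrame({"words": [
    ["abandon", "abducted"],
    ["abhor", "abilities", "missing_word"],
    ["not", "whole", "crew"],
]})

# reindex yields NaN for unknown words; sum() skips NaN,
# so a row with no known words scores 0.
df["score"] = df["words"].apply(lambda ls: s.reindex(ls).sum())
print(df)
```

Note that `reindex` requires the index of `s` to be unique, so duplicate words in the lexicon would need to be dropped or aggregated first. Repeated words inside a row's list are fine and are counted once per occurrence.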

User

#2 · Edited 2024-05-23 20:19:10

If the word and its value are stored together as a single string in one column, you first need to split that column into two:

df[['Sentiment', 'Sentiment_value']] = df['sentiment'].str.split(" ", expand=True)

Then look up each word in the Sentiment column and take its value from the Sentiment_value column.
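
A minimal sketch of that split, assuming a hypothetical frame `raw` whose "sentiment" column holds strings such as "abandon -2". The split pieces are strings, so the value is converted to a number before it is summed.

```python
import pandas as pd

# Hypothetical input: a single string column holding "word score" pairs.
raw = pd.DataFrame({"sentiment": ["abandon -2", "abhor -3", "abilities 2"]})

# Split on the first space into two columns.
parts = raw["sentiment"].str.split(" ", n=1, expand=True)
raw["Sentiment"] = parts[0]
# The split pieces are strings; convert the score to a number.
raw["Sentiment_value"] = pd.to_numeric(parts[1])

# Build a word -> score lookup and total the scores for one list of words.
s = raw.set_index("Sentiment")["Sentiment_value"]
total = s.reindex(["abandon", "abilities", "missing_word"]).sum()
print(total)
```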
