Sentence sentiment score using spaCy

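    import spacy
    from spacy.matcher import Matcher

    # Assumes the small English model is installed:
    #   python -m spacy download en_core_web_sm
    nlp = spacy.load("en_core_web_sm")
    matcher = Matcher(nlp.vocab)


    def set_sentiment(matcher, doc, i, matches):
        # Called once per match; nudges the document-level score upward.
        doc.sentiment += 0.1


    myemotionalwordlist = ['you', 'superb', 'great', 'free']

    sentence0 = 'You are a superb great free person'
    sentence1 = 'You are a great person'
    sentence2 = 'Rocks are made of minerals'
    sentences = [sentence0, sentence1, sentence2]

    # Match one or more emotional words. LOWER compares the lowercased token,
    # so "You" in the sentences matches "you" in the list (ORTH would not).
    pattern2 = [[{"LOWER": emotionalword, "OP": "+"}] for emotionalword in myemotionalwordlist]

    # spaCy v3 signature; on v2 use: matcher.add("Emotional", set_sentiment, *pattern2)
    matcher.add("Emotional", pattern2, on_match=set_sentiment)

    for sentence in sentences:
        doc = nlp(sentence)
        matches = matcher(doc)
        for match_id, start, end in matches:
            string_id = nlp.vocab.strings[match_id]
            span = doc[start:end]
        print("Sentiment", doc.sentiment)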

1 answer

User

#1 · Posted 2024-05-19 00:41:40

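There are two main approaches:

The one you have already started: keep a list of emotional words and count how often they appear.

Train a model on labelled example sentences and let it learn what counts as emotional.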

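The first approach gets better as you give it more words, but you will eventually hit a limit. (That is simply down to the ambiguity and flexibility of human language: for example, while "you" is more emotional than "it", there will be plenty of unemotional sentences that use "you".)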
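As a rough illustration of the word-list approach and its limit, here is a minimal sketch in plain Python, without spaCy. The function name `count_emotional_words` and the regex tokenizer are assumptions made for this example; the word list is the one from the question.

```python
import re

# Word list copied from the question; extend it with your own vocabulary.
EMOTIONAL_WORDS = {"you", "superb", "great", "free"}


def count_emotional_words(sentence, words=EMOTIONAL_WORDS):
    """Count how many tokens of `sentence` appear in the word list."""
    # Naive tokenizer (an assumption): lowercase runs of letters/apostrophes.
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return sum(token in words for token in tokens)


examples = [
    "You are a superb great free person",
    "You are a great person",
    "Rocks are made of minerals",
    "Did you lock the door?",  # not emotional, yet "you" still counts
]
for sentence in examples:
    print(count_emotional_words(sentence), sentence)
```

The last sentence is the kind of false positive described above: no word list can tell an emotional "you" from a neutral one.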

any suggestions on how I can extract emotional words from wordnet?

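Take a look at SentiWordNet, which adds a positivity, negativity and neutrality measure to each WordNet entry. For "emotional", you could simply extract the entries whose pos or neg score is over 0.5. (Watch out for the non-commercial-only license.)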
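One way to do that extraction, sketched under a few assumptions: it uses NLTK's SentiWordNet corpus reader (which needs the `wordnet` and `sentiwordnet` corpora downloaded first), and the helper names `emotional_lemmas` and `sentiwordnet_entries` are made up for this example. The scores in `sample` are invented placeholders so the filter can be tried without downloading anything; they are not real SentiWordNet values.

```python
def emotional_lemmas(entries, threshold=0.5):
    """Keep lemmas whose positive or negative score exceeds `threshold`.

    `entries` is any iterable of (lemma, pos_score, neg_score) tuples.
    """
    return sorted({lemma for lemma, pos, neg in entries
                   if pos > threshold or neg > threshold})


def sentiwordnet_entries():
    """Yield (lemma, pos, neg) for every SentiWordNet synset via NLTK.

    Requires a one-off download beforehand:
        import nltk; nltk.download("wordnet"); nltk.download("sentiwordnet")
    """
    from nltk.corpus import sentiwordnet as swn

    for senti in swn.all_senti_synsets():
        for lemma in senti.synset.lemma_names():
            yield lemma, senti.pos_score(), senti.neg_score()


# Invented placeholder scores, only to exercise the filter offline.
sample = [("superb", 0.9, 0.0), ("rock", 0.0, 0.0), ("awful", 0.0, 0.8)]
print(emotional_lemmas(sample))

# With the corpora installed, the real list would come from:
# word_list = emotional_lemmas(sentiwordnet_entries())
```

A word can belong to several synsets with different scores; this sketch keeps it if any one of its senses passes the threshold, which is a design choice you may want to tighten.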

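The second approach will probably work better if you can feed it enough training data, but "enough" can sometimes be too much. Other drawbacks are that these models usually need much more compute power and memory (a serious issue if you need to run offline or on a mobile device), and that they are a black box.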

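I think the 2020 approach would be to start with a pre-trained BERT model (the bigger the better, see the recent GPT-3 paper), and then fine-tune it with a manually annotated sample of 100K sentences. Evaluate it on another sample, and annotate more training data for the examples it got wrong. Keep doing this until you reach the desired level of accuracy.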
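The annotate, fine-tune, evaluate, annotate-more cycle can be sketched as a small loop. This is only an outline of the control flow, not BERT fine-tuning: `toy_fine_tune` is a deliberately trivial stand-in (it memorises words seen only in emotional sentences), and `refine`, `GOLD`, `seed` and `batches` are hypothetical names and data invented for this example. In practice you would pass in your own fine-tuning routine and a human annotator.

```python
def toy_fine_tune(labelled):
    """Stand-in for real fine-tuning: remember words seen only in emotional sentences."""
    emotional, neutral = set(), set()
    for sentence, label in labelled:
        (emotional if label else neutral).update(sentence.lower().split())
    cues = emotional - neutral
    return lambda sentence: any(w in cues for w in sentence.lower().split())


def refine(fine_tune, annotate, seed, eval_batches, target=0.9):
    """Train, evaluate on a fresh batch, add the mistakes to the training set, repeat."""
    labelled = [(s, annotate(s)) for s in seed]
    predict = fine_tune(labelled)
    history = []  # accuracy measured on each evaluation batch
    for batch in eval_batches:
        gold = [(s, annotate(s)) for s in batch]
        wrong = [(s, y) for s, y in gold if predict(s) != y]
        history.append(1 - len(wrong) / len(gold))
        if history[-1] >= target:
            break
        labelled.extend(wrong)          # annotate more data for the mistakes
        predict = fine_tune(labelled)   # retrain on the enlarged set
    return predict, history


# Invented toy data; True means "emotional".
GOLD = {
    "you are superb": True,
    "rocks are made of minerals": False,
    "what a great day": True,
    "the train leaves at noon": False,
    "i love this great song": True,
    "water boils at noon": False,
}
seed = ["you are superb", "rocks are made of minerals"]
batches = [
    ["what a great day", "the train leaves at noon"],
    ["i love this great song", "water boils at noon"],
]

predict, history = refine(toy_fine_tune, GOLD.get, seed, batches)
print(history)
```

Each round is scored on a batch the model has not trained on, so the accuracy figures are not inflated by the examples that were just added to the training set.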

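(By the way, spaCy has support for both approaches. What I called fine-tuning above is also known as transfer learning. See https://spacy.io/usage/training#transfer-learning. Googling "spaCy sentiment analysis" will also turn up quite a few tutorials.)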
