I have a user survey file:
Score Comment
8 Rapid bureaucratic affairs. Reports for policy...
4 There needs to be communication or feed back f...
7 service is satisfactory
5 Good
5 There is no
10 My main reason for the product is competition ...
9 Because I have not received the results. And m...
5 no reason
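I want to determine which keywords correspond to higher scores and which correspond to lower scores.
My idea is to build a table of words (or, alternatively, a "word vector" dictionary) that holds, for each word, the score associated with it and the number of times the word occurs in the comments.
Then, for each word, the average score is the mean of all the scores associated with that word.
To do this, my code is as follows: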
word_vec = {}
# col 1 is the word, col 2 is the score, col 3 is the number of times it occurs
for i in range(len(data)):
    sentence = data['SurveyResponse'][i].split(' ')
    for word in sentence:
        word_vec['word'] = word
        if word in word_vec:
            word_vec[word] = {'Score':data['SCORE'][i], 'NumberOfTimes':(word_vec[word]['NumberOfTimes'] += 1)}
        else:
            word_vec[word] = {'Score':data['SCORE'][i], 'NumberOfTimes':1}
But this code gives me the following error:
File "<ipython-input-144-14b3edc8cbd4>", line 9
word_vec[word] = {'Score':data['SCORE'][i], 'NumberOfTimes':(word_vec[word]['NumberOfTimes'] += 1)}
^
SyntaxError: invalid syntax
Can someone tell me the correct way to do this?
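You can use Counter from the collections module. It lets you count the number of times each word occurs.
The words are extracted from the documents and put into a list. Finally, the Counter processes that list to count the occurrences of each word.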
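The approach described above can be sketched roughly as follows. This is only an illustration: the comments list is a hypothetical stand-in for the survey's Comment column, and lowercasing the text before splitting is an assumption that the original answer may not have made.

```python
from collections import Counter

# Hypothetical sample comments standing in for the survey's Comment column.
comments = [
    "service is satisfactory",
    "Good",
    "no reason",
    "service is good",
]

# Pull the words out of every comment and collect them in one flat list.
# Lowercasing is an assumption, so that "Good" and "good" count as one word.
words = []
for comment in comments:
    words.extend(comment.lower().split())

# Counter tallies how many times each word appears in the list.
word_counts = Counter(words)
print(word_counts.most_common(3))
```

Note that Counter only gives occurrence counts; relating words to scores still needs a separate accumulation step.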
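Try incrementing 'NumberOfTimes' directly. In Python, += is a statement rather than an expression, so it cannot appear inside a dict literal; write it on its own line like this: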
word_vec[word]['NumberOfTimes'] += 1
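Putting that increment into the asker's loop, one possible sketch of the whole calculation is shown below. It makes several assumptions: the rows are the untruncated lines from the sample survey, written as plain (score, comment) tuples instead of the asker's data object; the 'TotalScore' key is a hypothetical name introduced here so that scores can be summed rather than overwritten; and words are lowercased before counting.

```python
# Rows copied from the untruncated lines of the sample survey: (score, comment).
rows = [
    (7, "service is satisfactory"),
    (5, "Good"),
    (5, "There is no"),
    (5, "no reason"),
]

word_vec = {}
for score, comment in rows:
    for word in comment.lower().split():
        if word in word_vec:
            # += is a statement, so each update gets its own line.
            # 'TotalScore' is a hypothetical key used to accumulate scores.
            word_vec[word]['TotalScore'] += score
            word_vec[word]['NumberOfTimes'] += 1
        else:
            word_vec[word] = {'TotalScore': score, 'NumberOfTimes': 1}

# Average score per word = summed scores / number of occurrences.
average_score = {
    word: stats['TotalScore'] / stats['NumberOfTimes']
    for word, stats in word_vec.items()
}
print(average_score)
```

A word that appears twice in the same comment is counted twice here; whether that is desirable depends on how the averages are meant to be read.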