How to get/use word-frequency Counter values for a text file in Python

Posted 2024-04-27 03:39:00

I'm new to Python and am having trouble with code that counts the words in a text file and then does other things with them. What I want to do is:

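I want to print the frequency of the exact word passed as an argument to the function, looked up in the Counter/dictionary built from the text file. I also want to add up the frequency values of words. For example, given ("apple", 2) and ("banana", 5), I want to get 7. In addition, I get an error when I dump to pickle.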

Here is what I have tried so far, but it has many errors. Any help would be greatly appreciated.

import os
import pickle
from collections import Counter

os.chdir("E:/")


def wfrequencies(word, text):
    f1 = open('words.txt', 'r')
    # Dictionary to find Unique words and their frequency
    message1 = f1.read()

    # count all word frequencies
    c = Counter(message1.split()).items()
    print(c)

    #count the total number of words
    f = sum(c.values())
    print(f)

    # only print the value of the given word argument; in this case 'or' is the argument
    p_word_count = word

    #Return the value of dictionary for the passed argument
    c["p_word_count"]

    # dump value of dictionary item in pickle file
    pickle.dump(c, open("my_out.dat", "wb"))

This is the error:

Traceback (most recent call last):
dict_items([('EGG-PLANTS', 1), ('IN', 1), ('THE', 1), ('OVEN', 1), ('(Melanzane', 1), ('al', 1), ('forno)', 1), ('Skin', 1), ('five', 1), ('or', 2), ('six', 1), ('egg-plants,', 1), ('cut', 1), ('them', 2), ('in', 4), ('round', 1), ('slices', 2), ('and', 6), ('salt', 1), ('so', 1), ('that', 2), ('they', 2), ('throw', 1), ('out', 1), ('the', 6), ('water', 1), ('contain.', 1), ('After', 1), ('a', 5), ('few', 1), ('hours', 1), ('dip', 1), ('flour', 1), ('frying', 1), ('oil.', 1), ('Take', 1), ('fireproof', 1), ('vase', 2), ('baking', 1), ('tin', 1), ('place', 1), ('layers,', 1), ('with', 4), ('grated', 2), ('cheese', 2), ('between', 1), ('each', 1), ('layer,', 1), ('abundantly', 1), ('seasoned', 1), ('tomato', 2), ('sauce', 1), ('(No.', 1), ('12).', 1), ('Beat', 1), ('one', 1), ('egg', 2), ('pinch', 1), ('of', 5), ('salt,', 1), ('tablespoonful', 1), ('sauce,', 1), ('teaspoonful', 1), ('two', 1), ('crumbs', 1), ('bread,', 1), ('cover', 1), ('upper', 1), ('layer', 1), ('this', 1), ('sauce.', 1), ('Put', 1), ('oven', 1), ('when', 1), ('is', 1), ('coagulated,', 1), ('serve', 1), ('hot.', 1)])

File "C:/Users/mn-ra/PycharmProjects/untitled/NLP.py", line 49, in <module>
wfrequencies("or", "recipe_Ital_102")

File "C:/Users/mn-ra/PycharmProjects/untitled/NLP.py", line 19, in wfrequencies
f = sum(c.values())
AttributeError: 'dict_items' object has no attribute 'values'

1 answer
User
#1 · Posted 2024-04-27 03:39:00

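I think p_word_count = word is redundant, and c["p_word_count"] is a source of error because it looks up the literal string "p_word_count" rather than the word that was passed in. Try c[p_word_count] (or simply c[word]).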
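A minimal sketch of the difference between the two lookups. The sample sentence is made up for illustration and stands in for the contents of words.txt, and the lookups are done on a real Counter (that is, without the .items() call):

```python
from collections import Counter

# Hypothetical sample text standing in for the contents of words.txt.
sample = "five or six egg-plants cut them or slice them"
counts = Counter(sample.split())

word = "or"

# Looks up the literal key "p_word_count", which is not a word in the text.
# A Counter returns 0 for a missing key instead of raising KeyError.
literal_lookup = counts["p_word_count"]

# Looks up the value held by the variable, i.e. the word "or".
variable_lookup = counts[word]

print(literal_lookup, variable_lookup)
```

Because a Counter silently returns 0 for unknown keys, the quoted version does not fail loudly; it just reports a count of 0 for every word.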

Please post the error.

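Edit: based on the error, c is not a Counter but the dict_items view returned by .items(), and that view has no values() method. Either drop the .items() call so that c stays a Counter (then sum(c.values()) works), or iterate over the (word, freq) tuples and sum the frequencies yourself.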
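A sketch of both options, plus the pickle step, under some assumptions: the helper name word_frequencies and the sample text are hypothetical, and a temporary directory is used instead of E:/ so the example runs anywhere. It is one possible arrangement, not necessarily what the asker's final code should look like.

```python
import pickle
import tempfile
from collections import Counter
from pathlib import Path


def word_frequencies(text):
    """Return a Counter mapping each whitespace-separated word to its count."""
    return Counter(text.split())


# Hypothetical sample text: "apple" appears 2 times, "banana" 5 times.
sample = "apple banana banana apple banana banana banana cherry"
counts = word_frequencies(sample)

# The view returned by .items() is not a Counter and has no values() method.
view = counts.items()
has_values = hasattr(view, "values")

# Option 1: keep the Counter and sum its values.
total_words = sum(counts.values())

# Option 2: iterate over the (word, freq) tuples of the view.
total_from_pairs = sum(freq for _word, freq in view)

# Sum the frequencies of selected words only, as in the apple/banana example.
selected_total = sum(counts[w] for w in ("apple", "banana"))

# Pickle the Counter itself (not the items view) and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = Path(tmp_dir) / "my_out.dat"
    with open(out_path, "wb") as fh:
        pickle.dump(counts, fh)
    with open(out_path, "rb") as fh:
        restored = pickle.load(fh)

print(total_words, total_from_pairs, selected_total, restored == counts)
```

Opening the files in with blocks closes them even if an error occurs, which the original open(...) calls did not guarantee.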
