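I have a txt file. I wrote some code that finds the unique words in the file and counts how many times each one appears. I now need to print these words out in the format of the sample output below. How can I do this?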
Here is a sample output:

Analyze what file: itsy_bitsy_spider.txt
Concordance for file itsy_bitsy_spider.txt
itsy : Total Count: 2
Line:1: The ITSY Bitsy spider crawled up the water spout
Line:4: and the ITSY Bitsy spider went up the spout again
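from string import punctuation

# Module-level state shared by the two functions below.
linelist = []   # every stripped line of the file, in order
wordlist = []   # every lowercased word that is not a stop word
stopwords = []  # assumption: filled with the stop words elsewhere

#this function collects the words of the file, skipping the stop words.
def openFiles(openFile):
    for i in openFile:
        i = i.strip()
        linelist.append(i)
        b = i.lower()
        thislist = b.split()
        for a in thislist:
            if a in stopwords:
                continue
            else:
                wordlist.append(a)
    #print wordlist

#this dictionary is used to count the number of times each word occurs
countdict = {}
def countWords(this_list):
    for word in this_list:
        # strip leading/trailing punctuation so "spout." counts as "spout"
        depunct = word.strip(punctuation)
        if depunct in countdict:
            countdict[depunct] += 1
        else:
            countdict[depunct] = 1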
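If you analyze the input text file line by line, you can maintain a second dictionary that maps word -> List<line>. Every word in a line gets an entry added to it. It might look something like the following. Bear in mind that I am not very familiar with Python, so I may have missed some syntactic shortcuts.
For example: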
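A minimal sketch of that word -> list-of-lines mapping, written for Python 3. The names `linedict`, `countdict`, `stopwords` and `punctuation` come from the question and the answer; the functions `build_concordance` and `format_concordance` are hypothetical helpers introduced only for illustration, and the output layout is an assumption copied from the sample output in the question (the sketch does not upper-case the matched word inside each line).

```python
from string import punctuation


def build_concordance(lines, stopwords=()):
    """Return (countdict, linedict) built in one pass over `lines`.

    countdict maps word -> number of occurrences.
    linedict maps word -> list of (line_number, line_text) pairs.
    """
    countdict = {}
    linedict = {}
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        for token in line.lower().split():
            word = token.strip(punctuation)
            if not word or word in stopwords:
                continue
            countdict[word] = countdict.get(word, 0) + 1
            # One entry per occurrence: a word that appears twice on a
            # line is recorded twice here.
            linedict.setdefault(word, []).append((number, line))
    return countdict, linedict


def format_concordance(filename, countdict, linedict):
    """Return the report as a list of output lines."""
    out = ["Concordance for file " + filename]
    for word in sorted(countdict):
        out.append("%s : Total Count: %d" % (word, countdict[word]))
        for number, line in linedict[word]:
            out.append("Line:%d: %s" % (number, line))
    return out
```

To use it, pass an open file object as `lines` (for instance inside `with open(filename) as f:`) and print the result with `print("\n".join(format_concordance(filename, countdict, linedict)))`.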
One modification you may need to make is to avoid adding a duplicate entry to linedict when a word appears twice on the same line.
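One way that guard might look, assuming `linedict` maps each word to a list of (line number, line text) pairs and that lines are processed in increasing order. `add_line` is a hypothetical helper name, not something from the original answer.

```python
def add_line(linedict, word, number, line):
    """Record that `word` occurs on line `number`, at most once per line."""
    entries = linedict.setdefault(word, [])
    # Lines arrive in order, so a duplicate can only be the last entry.
    if not entries or entries[-1][0] != number:
        entries.append((number, line))
```

Checking only the last entry keeps each insertion constant-time; if lines could arrive out of order, a set of seen line numbers per word would be the safer choice.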
The code above assumes that you only want to read the text file once.