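I'm trying to write a Python program that prints every trigram (sequence of three consecutive words) that occurs three or more times in a text file (examen.txt in the code below). Case and special characters should be ignored, and the output should be sorted by frequency.
My teacher told me I only need to change two lines! But I've gone Python-blind. To me the code looks correct, but it doesn't work.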
with open("examen.txt") as f:
    data = f.read()
text = data.replace("\xad", "")
words = []
for word in data.lower().split():
    word = word.strip("‚‘!,.:«»-()'_#-–„“*?")
    if word != "":
        if not word[-1].isalnum():
            print(repr(word))
        words.append(word)
trigrams = {}
for i in range(len(words)):
    word = words[i]
    nextword = words[i + 1]
    nextnextword = words[i + 2]
    key = (word, nextword, nextnextword)
    trigrams[key] = trigrams.get(key, 0) + 1
l = list(trigrams.items())
l.sort(key=lambda x: (x[1], x[0]))
l.reverse()
for key, count in trigrams:
    if count < 3:
        break
    word = key[0]
    nextword = key[1]
    nextnextword = key[2]
    print(word, nextword, nextnextword, count)
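When you build the trigrams, you iterate too far into the words list, and in the last loop you are not printing from the correct data structure. So the two lines to change are the range(...) line of the trigram loop and the for line of the final loop.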
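The answer's own two lines are cut off in this copy, so the following is only a sketch of what such a fix might look like, not necessarily the answerer's exact wording. It assumes the trigram loop should stop two words before the end of the list, since otherwise words[i + 1] raises an IndexError on the last index, and that the final loop should walk the sorted list l rather than the trigrams dict, whose iteration yields only the three-word keys. The sample word list and the name frequent are made up for illustration; the original reads its words from examen.txt.

```python
# Hypothetical sample input; the question builds this list from examen.txt.
words = "a b c a b c a b c d".split()

trigrams = {}
# Changed line 1: stop two words early so words[i + 2] always exists.
for i in range(len(words) - 2):
    key = (words[i], words[i + 1], words[i + 2])
    trigrams[key] = trigrams.get(key, 0) + 1

# Sort by count, then by the words themselves, highest count first.
l = list(trigrams.items())
l.sort(key=lambda x: (x[1], x[0]))
l.reverse()

frequent = []  # illustrative helper list, not in the original code
# Changed line 2: iterate over the sorted (key, count) pairs in l.
for key, count in l:
    if count < 3:
        break  # everything after this point occurs fewer than 3 times
    frequent.append((key, count))
    print(key[0], key[1], key[2], count)
```

Because l is sorted by descending count, the break on count < 3 is safe only when looping over l. With the original line the loop never got that far: unpacking a three-word key into key, count fails with a ValueError.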