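Writing the repeated words from a text file to another text file
I am trying to write a function that reads a text file and writes each repeated word, once only, to another text file. The problem is that this code only outputs the repeated words from the last line.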
import string

def repeat_words(in_file, out_file):
    file_in = open(in_file, "r")
    file_out = open(out_file, "w")
    for lines in file_in:
        words = lines.lower().split()
        seen_words = []
        repeat_words = []
        for word in words:
            word = word.strip(string.punctuation)
            if word in seen_words and word not in repeat_words:
                repeat_words.append(word)
            seen_words.append(word)
    file_out.write(' '.join(repeat_words) + '\n')
    file_in.close()
    file_out.close()

inF = "nnn.txt"
OutF = "Untitled-1.txt"
repeat_words(inF, OutF)
1 Answer
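"I want to write a function that reads a text file and writes each repeated word, once only, to another text file."
Your code is a bit unclear: seen_words and repeat_words are reset for every line, so only the words repeated within the last line are left when the result is written.
If you want to extract the repeated words from file A and write them to file B, try the code below. It keeps both collections outside the line loop so that they cover the whole file.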
import string

def repeat_words(in_file, out_file):
    # "with" closes both files automatically, even if an error occurs
    with open(in_file, "r") as file_in, open(out_file, "w") as file_out:
        # Created once, before the loop, so they track the whole file
        seen_words = set()
        repeated_words = []
        for line in file_in:
            for word in line.split():
                # Normalise case and drop leading/trailing punctuation
                word = word.lower().strip(string.punctuation)
                if not word:
                    continue  # the token was punctuation only
                if word in seen_words and word not in repeated_words:
                    repeated_words.append(word)
                seen_words.add(word)
        file_out.write(' '.join(repeated_words))

inF = "nnn.txt"
OutF = "Untitled-1.txt"
repeat_words(inF, OutF)
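If you want to check the logic without creating files, here is a minimal sketch of the same idea that works on a list of lines held in memory. The helper name find_repeated_words and the sample sentences are made up for illustration and are not part of your code. The point it shows is that the "seen" set and the result list are created once, before the loop over lines, so a word repeated across different lines is still caught.

```python
import string


def find_repeated_words(lines):
    """Return the words that occur more than once across all lines.

    Hypothetical helper for illustration: words are compared
    case-insensitively, with leading/trailing punctuation removed,
    and are returned in the order in which each one first repeats.
    """
    seen_words = set()     # every word met so far, across all lines
    repeated_words = []    # ordered result, each word listed once
    for line in lines:
        for raw in line.split():
            word = raw.lower().strip(string.punctuation)
            if not word:
                continue   # token consisted only of punctuation
            if word in seen_words and word not in repeated_words:
                repeated_words.append(word)
            seen_words.add(word)
    return repeated_words


sample = ["The cat sat.", "A dog saw the cat!", "the end"]
print(find_repeated_words(sample))  # ['the', 'cat']
```

Since an open file is itself an iterable of lines, you should be able to pass file_in straight to a helper like this and write ' '.join(...) of the result to the output file.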