Python: remove lines that contain specific words, not every line that contains the word as a substring

# bad_words = ["on", "off"]
# sentences = ["Learning Python is an ongoing task", "I practice on and off", "I do it offline", "On weekdays i practice the most", "In weekends I am off"]

def clean_sentences(sentences, bad_words, outfile, badfile):
    bad_words_list = []
    with open(bad_words) as wo:
        bad_words_list = wo.readlines()
    b_lists = list(map(str.strip, bad_words_list))
    for line in b_lists:
        line = line.strip('\n')
        line = line.lower()
        bad_words_list.insert(len(bad_words_list), line)
    with open(sentences) as oldfile, open(outfile, 'w') as newfile, open(badfile, 'w') as badwords:
        for line in oldfile:
            if not any(bad_word in line for bad_word in bad_words):
                newfile.write(line)
            else:
                badwords.write(line)

clean_sentences('sentences.txt', 'bad_words.txt', 'outfile.txt', 'badfile.txt')

1 answer

User

#1 · Posted 2024-04-27 00:27:00

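Instead of checking whether any bad word occurs anywhere in the sentence, check whether any bad word appears in the result of splitting the sentence. That way a bad word is matched only when it is a separate word in the sentence, not when it is merely a substring of another word.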
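As a minimal sketch of the difference, using one of the sample sentences from the question: `in` applied to a string tests for a substring, while `in` applied to the list returned by `split()` tests for a whole word. The variable names are illustrative only.

```python
sentence = "Learning Python is an ongoing task"

# Substring test: "on" occurs inside "Python" and "ongoing"
substring_hit = "on" in sentence

# Whole-word test: split() breaks the sentence on whitespace into a list of words
word_hit = "on" in sentence.split()

print(substring_hit)  # True
print(word_hit)       # False
```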

Here is a simplified version of your code (without the file handling):

bad_words = ["on", "off"]
sentences = ["Learning Python is an ongoing task", "I practice on and off", "I do it offline", "On weekdays i practice the most", "In weekends I am off"]

def clean_sentences(sentences, bad_words):
    for sentence in sentences:
        # Compare against whole lowercase words, not substrings
        words = sentence.lower().split()
        if any(word in words for word in bad_words):
            print(f'Found bad word in {sentence}')

clean_sentences(sentences, bad_words)

# output
Found bad word in I practice on and off
Found bad word in On weekdays i practice the most
Found bad word in In weekends I am off

For your own code, just change

            if not any(bad_word in line for bad_word in bad_words):
                newfile.write(line)

to

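            # Compare against whole lowercase words. Iterate over bad_words_list
            # (the words read from the file); bad_words is only the file name.
            if not any(bad_word in line.lower().split() for bad_word in bad_words_list):
                newfile.write(line)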

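Edit: to make the search case-insensitive, compare against the lowercase form of each word in the sentence (this assumes the bad words themselves are lowercase). I updated the code so the words are lowercased before the check.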
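Putting the pieces together, a file-based version might look like the sketch below. It is not the asker's exact code: the helper names `load_bad_words` and `has_bad_word` are hypothetical, the UTF-8 encoding is an assumption, and stripping punctuation from each word (so that "off." still counts as "off") goes beyond what the answer above does.

```python
import string


def load_bad_words(path):
    """Read one bad word per line, lowercased, skipping blank lines."""
    with open(path, encoding="utf-8") as f:  # encoding is an assumption
        return {line.strip().lower() for line in f if line.strip()}


def has_bad_word(line, bad_words):
    """Return True if any whole word of `line` is in `bad_words`."""
    # Lowercase, split on whitespace, then strip surrounding punctuation
    # (the punctuation handling is an addition, not part of the answer above).
    words = (w.strip(string.punctuation) for w in line.lower().split())
    return any(w in bad_words for w in words)


def clean_sentences(sentences, bad_words, outfile, badfile):
    """Write clean lines to `outfile` and lines with a bad word to `badfile`."""
    bad = load_bad_words(bad_words)
    with open(sentences, encoding="utf-8") as src, \
         open(outfile, "w", encoding="utf-8") as good, \
         open(badfile, "w", encoding="utf-8") as flagged:
        for line in src:
            target = flagged if has_bad_word(line, bad) else good
            target.write(line)
```

It would be called the same way as in the question, for example `clean_sentences('sentences.txt', 'bad_words.txt', 'outfile.txt', 'badfile.txt')`. Keeping the bad words in a set makes each word lookup cheap, which may matter for long word lists.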
