Creating a censor function from a list of bad words

1 vote

2 answers

4005 views

Asked 2025-04-18 13:14

I'm trying to write a function that censors certain words in a string. It partly works, but it has a few quirks.

Here is my code:

def censor(sentence):
    badwords = 'apple orange banana'.split()
    sentence = sentence.split()

    for i in badwords:
        for words in sentence:
            if i in words:
                pos = sentence.index(words)
                sentence.remove(words)
                sentence.insert(pos, '*' * len(i))

    print(" ".join(sentence))

sentence = "you are an appletini and apple. new sentence: an orange is a banana. orange test."

censor(sentence)

The output is:

you are an ***** and ***** new sentence: an ****** is a ****** ****** test.

Some of the punctuation has disappeared, and the word "appletini" was wrongly replaced.

How can I fix this?

Also, is there a simpler way to do this?

Tags: string processing, text replacement, programming questions, filter function, bad words

2 Answers

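Try this. str.replace leaves the punctuation intact, although it still matches inside longer words such as "appletini":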

for i in bad_word_list:
    sentence = sentence.replace(i, '*' * len(i))
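To see what this loop does with the sentence from the question, here is a minimal sketch. The function name censor_replace and its badwords default are assumptions added for illustration; they are not part of the original snippet.

```python
def censor_replace(sentence, badwords=("apple", "orange", "banana")):
    # Hypothetical wrapper around the loop above.
    # str.replace substitutes every occurrence, including ones inside longer words.
    for bad in badwords:
        sentence = sentence.replace(bad, "*" * len(bad))
    return sentence


sentence = "you are an appletini and apple. new sentence: an orange is a banana. orange test."
result = censor_replace(sentence)
print(result)
# you are an *****tini and *****. new sentence: an ****** is a ******. ****** test.
```

The punctuation survives, but "appletini" is still partly censored because the match is on substrings, not whole words.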

Answered 2025-04-18 by Python大师

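The specific problems are:

you don't account for punctuation at all;
when inserting the '*' characters, you use the length of the bad word rather than the length of the word being replaced.

I would swap the order of the loops so that the sentence is processed only once, and use enumerate rather than remove and insert: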

def censor(sentence):
    badwords = ("test", "word") # consider making this an argument too
    sentence = sentence.split()

    for index, word in enumerate(sentence):
        if any(badword in word for badword in badwords):
            sentence[index] = "".join(['*' if c.isalpha() else c for c in word])

    return " ".join(sentence) # return rather than print

Testing str.isalpha means only letters (upper- and lowercase) are replaced with asterisks, so punctuation is preserved. Demo:

>>> censor("Censor these testing words, will you? Here's a test-case!")
"Censor these ******* *****, will you? Here's a ****-****!"
            # ^ note length                         ^ note punctuation
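Note that the version above still censors any word that merely contains a bad word ("testing" here, "appletini" in the question). If only whole words should be censored, one possible alternative is a regular expression with word boundaries. This is a sketch under that assumption, not the method used above; the name censor_whole_words, the default word list, and the case-insensitive matching are choices made for illustration.

```python
import re


def censor_whole_words(sentence, badwords=("apple", "orange", "banana")):
    # \b matches at a word boundary, so "apple" matches in "apple."
    # but not inside "appletini". re.escape guards against regex
    # metacharacters in the word list.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in badwords) + r")\b",
        re.IGNORECASE,  # assumption: matching should ignore case
    )
    # Replace each match with asterisks of the same length.
    return pattern.sub(lambda m: "*" * len(m.group()), sentence)


sentence = "you are an appletini and apple. new sentence: an orange is a banana. orange test."
result = censor_whole_words(sentence)
print(result)
# you are an appletini and *****. new sentence: an ****** is a ******. ****** test.
```

Because the substitution works on the string directly, there is no need to split and rejoin the sentence, so the original spacing and punctuation are kept as they are.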

Answered 2025-04-18 by Python大师
