Problem replacing a string that contains the less-than symbol

import re

replacements = {'toot': 'titi-', '<75%': 'NONE'}

def replace(match):
    # Look up the replacement for whichever key matched
    return replacements[match.group(0)]

def clean75Case(text_page):
    # Each key is wrapped in \b...\b; this is what fails for '<75%'
    return re.sub('|'.join(r'\b%s\b' % re.escape(s) for s in replacements),
                  replace, text_page)

if __name__ == '__main__':
    print(clean75Case("toot iiii <75%"))

1 Answer

User

#1 · Posted 2024-04-27 13:34:04

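As mentioned in the comments, the problem is that \b only matches at a boundary between a word character and a non-word character. From the docs: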

\b
Matches the empty string, but only at the beginning or end of a word. A word is defined as a sequence of word characters. Note that formally, \b is defined as the boundary between a \w and a \W character (or vice versa), or between \w and the beginning/end of the string

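In the string you gave, the space character ( ) and the less-than character (<) are both non-word characters, so \b does not match at the position between them. The same applies at the end of the string: % is a non-word character, so there is no boundary after it either.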
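To see this behavior in isolation, here is a small sketch that runs the two keys against the sample string separately. The variable names are only illustrative and are not part of the original code.

```python
import re

text = "toot iiii <75%"

# 't' is a word character and the start of the string counts as a boundary,
# so \btoot\b matches.
word_hits = re.findall(r'\btoot\b', text)

# Before '<' there is a space, and after '%' the string ends: neither
# position has a word character on one side, so \b fails at both.
symbol_hits = re.findall(r'\b<75%\b', text)

# Without \b the literal text is found as expected.
plain_hits = re.findall(re.escape('<75%'), text)

print(word_hits, symbol_hits, plain_hits)
```

Only the first and third searches find anything, which is why the original clean75Case replaces toot but leaves <75% untouched.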

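As an alternative way to solve this, consider using split() to break the string into words and looking each word up in the replacements dictionary, like this: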

replacements = {'toot': 'titi-',
                '<75%': 'NONE'}

def clean75Case(text_page):
    words = text_page.split()
    return ' '.join(replacements.get(w, w) for w in words)

if __name__ == '__main__':
    print(clean75Case("toot iiii <75%"))
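One thing to keep in mind with the split() version is that split() followed by ' '.join() collapses tabs and repeated spaces into single spaces. If the original whitespace has to survive, one possible variation (a sketch, not the only way to do it) is to keep re.sub but swap \b for whitespace lookarounds, so that a key only matches when it is surrounded by whitespace or the ends of the string. The name clean75CaseRegex is hypothetical.

```python
import re

replacements = {'toot': 'titi-', '<75%': 'NONE'}

# (?<!\S) : not preceded by a non-whitespace character
# (?!\S)  : not followed by a non-whitespace character
# Together they require each key to stand alone as a whitespace-delimited token.
alternatives = '|'.join(re.escape(key) for key in replacements)
pattern = re.compile(r'(?<!\S)(?:' + alternatives + r')(?!\S)')

def clean75CaseRegex(text_page):
    # Replace each standalone key, leaving all other text and whitespace as-is
    return pattern.sub(lambda m: replacements[m.group(0)], text_page)

if __name__ == '__main__':
    print(clean75CaseRegex("toot iiii <75%"))
```

Unlike \b, these lookarounds do not care whether the key starts or ends with a word character, so they treat toot and <75% the same way.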
