How do I handle garbled strings like "\xe7\xbe\x8e"?

User

Reply #1 · edited 2024-04-24 23:58:48

You can use str.isalnum to check whether each word in the list is alphanumeric. If it is, keep it; otherwise discard it. A list comprehension does this:

>>> s = ['a','\xe7\xbe\x8e\xe7','b']
>>> [a for a in s if a.isalnum()]
['a', 'b']

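Note: isalnum checks whether a string is alphanumeric, that is, made up only of letters and/or digits. If only letters should be allowed, use str.isalpha instead.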
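To make the difference between the two checks concrete, here is a minimal sketch. The sample list and the extra entries 'c3' and '42' are assumptions added for illustration; they are not from the original question.

```python
# Hypothetical sample data; the second entry mirrors the garbled string above.
words = ['a', '\xe7\xbe\x8e\xe7', 'b', 'c3', '42']

# isalnum keeps entries made up only of letters and/or digits.
alnum_only = [w for w in words if w.isalnum()]

# isalpha is stricter: any digit disqualifies the entry.
alpha_only = [w for w in words if w.isalpha()]

print(alnum_only)
print(alpha_only)
```

One caveat, assuming Python 3: both methods are Unicode-aware, so a correctly decoded letter such as '美' passes isalnum and isalpha. These checks filter out the garbled form, not non-English text in general.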

User
Reply #2 · edited 2024-04-24 23:58:48

def is_ascii(s):
    return all(ord(c) < 128 for c in s)  # True only if every character is ASCII

s = [e for e in s if is_ascii(e)]
Try this. It removes entries that contain non-ASCII characters (such as \xe7\xbe\x8e\xe7). Hope this helps!
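If you are on Python 3.7 or later, the built-in str.isascii method should give the same result as a hand-written check. The sketch below compares the two; the variable names are chosen for illustration only.

```python
def is_ascii(text):
    """Return True if every character has a code point below 128."""
    return all(ord(ch) < 128 for ch in text)

# Sample list taken from the thread.
items = ['a', '\xe7\xbe\x8e\xe7', 'b']

# Hand-written check, works on any Python 3 version.
manual = [e for e in items if is_ascii(e)]

# Built-in equivalent; str.isascii was added in Python 3.7.
builtin = [e for e in items if e.isascii()]

print(manual)
print(builtin)
```

Note that either version drops every entry containing a non-ASCII character, including legitimate text, so it only fits when the list is expected to be pure ASCII.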

User
Reply #3 · edited 2024-04-24 23:58:48

Try this:

s = ['a','\xe7\xbe\x8e\xe7','b']
# list.remove() deletes only the first match, so call it once per occurrence
for _ in range(s.count('\xe7\xbe\x8e\xe7')):
    s.remove('\xe7\xbe\x8e\xe7')

After this, every occurrence of "\xe7\xbe\x8e\xe7" has been removed from the list.
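As an alternative sketch, a list comprehension removes every occurrence of one known value in a single pass and builds a new list instead of mutating the original. The names items, unwanted, and cleaned are hypothetical, and the duplicate entry is added to show that all matches are dropped.

```python
# Sample list with the unwanted value appearing twice.
items = ['a', '\xe7\xbe\x8e\xe7', 'b', '\xe7\xbe\x8e\xe7']
unwanted = '\xe7\xbe\x8e\xe7'

# Keep everything that is not exactly equal to the unwanted value.
cleaned = [e for e in items if e != unwanted]

print(cleaned)
```

This only helps when the exact garbled value is known in advance. For arbitrary garbled entries, a character-based check is likely a better fit.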