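<p>One possibility is that, rather than calling the re methods directly, you wrap them in something that understands \u escapes on their behalf. Something like this:</p>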
<pre><code>import re

def my_re_search(pattern, s):
    return re.search(unicode_unescape(pattern), s)

def unicode_unescape(s):
    """
    Turn \uxxxx escapes into actual unicode characters
    """
    def unescape_one_match(matchObj):
        escape_seq = matchObj.group(0)
        # Python 2: str.decode turns the escape into a unicode character
        return escape_seq.decode('unicode_escape')
    return re.sub(r"\\u[0-9a-fA-F]{4}", unescape_one_match, s)
</code></pre>
<p>Example of it working:</p>
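<p>As a hedged illustration, here is a minimal Python 3 sketch of the same wrapper together with a sample search. It is a port, not the code above: it assumes Python 3, where <code>str</code> has no <code>decode</code> method, so each escape is converted with <code>chr(int(..., 16))</code> instead. The names <code>_U_ESCAPE</code>, <code>pat</code> and <code>path</code> are made up for the example. Note that since Python 3.3 the <code>re</code> module understands \u escapes in string patterns on its own, so there the wrapper is mostly illustrative.</p>

```python
import re

# Matches a backslash, "u", and exactly four hex digits, e.g. \u20ac.
# Caveat: like the original, it does not check whether that backslash
# is itself escaped (as in r"\\u20ac").
_U_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def unicode_unescape(pattern):
    """Replace each \\uXXXX escape in pattern with the character it names."""
    return _U_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), pattern)


def my_re_search(pattern, s):
    """Like re.search, but expands \\uXXXX escapes in the pattern first."""
    return re.search(unicode_unescape(pattern), s)


# Hypothetical sample data, chosen only for illustration.
pat = r"C:\\.*\u20ac"                       # "C:", a backslash, anything, euro sign
path = "C:\\reports\\twenty\u20acplan.txt"

# The six characters \u20ac in pat become the single euro character;
# the escaped backslash before ".*" is left untouched.
expanded = unicode_unescape(pat)

print(len(pat) - len(expanded))             # 5
print(my_re_search(pat, path) is not None)  # True
```

<p>Only the \uXXXX sequence is rewritten; every other backslash in the pattern keeps its regex meaning.</p>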
<p>Thanks to <a href="https://stackoverflow.com/questions/4020539/process-escape-sequences-in-a-string-in-python/4020824#4020824">Process escape sequences in a string in Python</a> for pointing out the decode("unicode_escape") idea.</p>
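<p>Note, however, that you can't just run the whole pattern through decode("unicode_escape"). It will work some of the time (because most regex special characters don't change their meaning when a backslash is put in front of them), but it won't work in general. For example, here using decode("unicode_escape") changes the meaning of the regular expression:</p>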
<pre><code>>>> pat = r"C:\\.*\u20ac"  # U+20AC is the euro sign
>>> print pat
C:\\.*\u20ac      # asks for a literal backslash
>>> pat_revised = pat.decode("unicode_escape")
>>> print pat_revised
C:\.*€            # asks for a literal period (without a backslash)
</code></pre>
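<p>The difference can also be checked against an actual string. The following is a small sketch under the assumption of Python 3, where the whole-pattern decode is written as <code>pat.encode("ascii").decode("unicode_escape")</code>; <code>path</code> is a hypothetical sample input, and the targeted replacement uses <code>chr(int(..., 16))</code> in place of the Python 2 <code>decode</code> call.</p>

```python
import re

pat = r"C:\\.*\u20ac"                       # U+20AC is the euro sign
path = "C:\\reports\\twenty\u20acplan.txt"  # hypothetical sample input

# Whole-pattern decoding: the escaped backslash collapses to a single one
# as well, so the regex becomes C:\.*€ ("C:", any number of literal
# periods, then the euro sign).
whole = pat.encode("ascii").decode("unicode_escape")

# Targeted replacement: only \uXXXX sequences are touched, so the regex
# stays C:\\.*€ ("C:", a literal backslash, anything, then the euro sign).
targeted = re.sub(r"\\u([0-9a-fA-F]{4})",
                  lambda m: chr(int(m.group(1), 16)), pat)

print(re.search(whole, path) is not None)          # False
print(re.search(targeted, path) is not None)       # True
print(re.search(whole, "C:..\u20ac") is not None)  # True
```

<p>The decoded pattern no longer matches the Windows-style path and instead matches a string of periods, which is the change of meaning described above.</p>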