"nothing to repeat" error from a lookahead assertion in a verbose regex

Posted 2024-04-25 00:21:45

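I have a regex compiled with the verbose (re.X) flag that raises an exception, even though it looks equivalent to its condensed version. (I built the former from the latter.)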

Condensed version:

import re
test = 'catdog'
test2 = 'dogcat'
pat = re.compile(r'(?=\b\w{6}\b)\b\w*cat\w*\b')

print(pat.search(test))
print(pat.search(test2))
# catdog Match object
# dogcat Match object

Verbose version:

pat = re.compile(r"""(               # Start of group (lookahead); need raw string
                     ?=              # Positive lookahead; notation = `q(?=u)`
                     \b\w{6}\b       # Word boundary and 6 alphanumeric characters
                     )               # End of group (lookahead)
                     \b\w*cat\w*\b   # Literal 'cat' in between 0 or more alphanumeric""", re.X)
print(pat.search(test).string)
print(pat.search(test2).string)

# Throws exception
# error: nothing to repeat at position 83 (line 2, column 22)

What causes this? I can't see how the expanded version violates any of the conditions of re.X/re.VERBOSE. From the docs:

This flag allows you to write regular expressions that look nicer and are more readable by allowing you to visually separate logical sections of the pattern and add comments. Whitespace within the pattern is ignored, except when in a character class or when preceded by an unescaped backslash. When a line contains a # that is not in a character class and is not preceded by an unescaped backslash, all characters from the leftmost such # through the end of the line are ignored.

As far as I can tell, there is no character class here and no whitespace preceded by an unescaped backslash.


2 Answers

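The problem is the ?= on the second line. A ? can mean several things; on its own it is a quantifier, as in [ ]?, which matches 0 or 1 spaces, and I believe that is how it is being read here because of the whitespace in front of it. Whitespace is ignored, but it still splits the two characters (? into separate tokens.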

Move ?= up to line 1, so that the line starts with (?=, and it works.
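As a sketch of that fix, here is the verbose pattern from the question with (?= kept together on the first line. The names test and test2 come from the question; the comments inside the pattern are reworded, and using .group() to show the matched text is my own choice.

```python
import re

test = 'catdog'
test2 = 'dogcat'

# The lookahead opener "(?=" is written as one unbroken token.
pat = re.compile(r"""(?=             # Start of positive lookahead
                     \b\w{6}\b       # A whole word of exactly 6 word characters
                     )               # End of lookahead
                     \b\w*cat\w*\b   # A word containing the literal 'cat'
                     """, re.X)

print(pat.search(test).group())
print(pat.search(test2).group())
```

Both calls should now find a match, as they do with the condensed pattern.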

The error:

error: nothing to repeat at position 83

Clearly, the ? is being interpreted as a repetition operator here.

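This is Python issue 15606: the behavior of re with whitespace inside tokens in verbose mode does not match the documentation. You cannot put whitespace in the middle of (?=.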
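A minimal reproduction, assuming a Python 3 interpreter: the one-character pattern and the names message and ok are made up for illustration and are not part of the question.

```python
import re

# Whitespace between "(" and "?=" splits the token, even with re.X.
try:
    re.compile(r"( ?=a)", re.X)
except re.error as exc:
    message = exc.msg
    print("failed to compile:", message)

# Written without the space, the same pattern compiles.
ok = re.compile(r"(?=a)", re.X)
print(ok.match("abc") is not None)
```

The first pattern is parsed as an ordinary group whose first item is a bare ?, which is why the message talks about repetition rather than about lookaheads.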
