Getting more matches than expected in Python

Posted 2024-06-10 03:32:08

I'm trying to do some pattern matching in Python, but I don't understand why I get a second group when there is only one match.

import re

def Main():
    m = "12312312ranger12312319"
    pattern = re.compile(r'(\d$)')
    r = pattern.search(m)
    if r:
        print("Matched " + r.group(0) + " Second " + r.group(1))
    else:
        print("Not Matched")

if __name__ == '__main__':
    Main()

This gives me the following output:

Matched 9 Second 9

I thought r.group(1) shouldn't be there at all. Am I misunderstanding something?


3 Answers

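Because of the $ sign, you are matching at the end of the string. The 9 is both the entire match and the content of the first parenthesized subgroup, so group(0) (the whole match) and group(1) (the first parenthesized subgroup) both return 9.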

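If you don't want group(1), remove the grouping from the pattern and use r'\d$'. Note that $ anchors the match to the end of the string, so \d still matches the last character, 9.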
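As a minimal sketch of that suggestion (the variable names here are illustrative, not from the question): without parentheses the pattern defines no capturing groups, so only group(0) exists and asking for group(1) raises IndexError.

```python
import re

text = "12312312ranger12312319"

# No parentheses, so the pattern defines no capturing groups.
match = re.search(r"\d$", text)

whole = match.group(0)    # the entire match
groups = match.groups()   # tuple of captured subgroups, empty here

try:
    match.group(1)
    has_group_one = True
except IndexError:
    # Group 1 is not defined in the pattern.
    has_group_one = False

print(whole, groups, has_group_one)
```

If you need parentheses only for grouping and not for capturing, a non-capturing group such as (?:\d)$ should behave the same way with respect to group(1).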

From the Python re documentation:

group() Returns one or more subgroups of the match. If there is a single argument, the result is a single string; if there are multiple arguments, the result is a tuple with one item per argument. Without arguments, group1 defaults to zero (the whole match is returned). If a groupN argument is zero, the corresponding return value is the entire matching string; if it is in the inclusive range [1..99], it is the string matching the corresponding parenthesized group. If a group number is negative or larger than the number of groups defined in the pattern, an IndexError exception is raised. If a group is contained in a part of the pattern that did not match, the corresponding result is None. If a group is contained in a part of the pattern that matched multiple times, the last match is returned.

Example:

>>> m = re.match(r"(\w+) (\w+)", "Isaac Newton, physicist")
>>> m.group(0)       # The entire match
'Isaac Newton'
>>> m.group(1)       # The first parenthesized subgroup.
'Isaac'
>>> m.group(2)       # The second parenthesized subgroup.
'Newton'
>>> m.group(1, 2)    # Multiple arguments give us a tuple.
('Isaac', 'Newton')
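The last two rules in the quoted passage are easy to miss, so here is a small sketch of them. The patterns and input strings are made up for illustration and are not part of the documentation excerpt.

```python
import re

# An optional group that takes no part in the match yields None.
m1 = re.match(r"(\d+)(\.\d+)?", "42")
unmatched = m1.group(2)

# A group that matches several times keeps only its last match,
# while group(0) still covers everything that was matched.
m2 = re.match(r"(\w)+", "abc")
repeated = m2.group(1)
entire = m2.group(0)

print(unmatched, repeated, entire)
```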

group(0) always returns the entire matched text, regardless of whether any of it was captured in a group. See this example:

import re

def Main():
    m = "12312312ranger12312319"
    pattern = re.compile(r'\d(\d$)')
    r = pattern.search(m)
    if r:
        print(r.group(0) + ' ' + r.group(1))
    else:
        print("Not Matched")

if __name__ == '__main__':
    Main()

Output:

19 9

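Because you are both matching and capturing the last digit at the end of the line, group(0) and group(1) refer to the same text. (\d$) does not only capture, it also matches. So group(0) prints the characters matched by the whole pattern, and group(1) prints the characters captured by group index 1.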
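One way to see this for yourself is to compare the spans of the two groups. This is only a sketch with illustrative variable names; it uses the question's input string and the two patterns from this thread.

```python
import re

text = "12312312ranger12312319"

# The question's pattern: the parentheses wrap the whole expression.
single = re.compile(r"(\d$)")
m_single = single.search(text)

# The wider pattern: one extra digit is matched outside the group.
wider = re.compile(r"\d(\d$)")
m_wider = wider.search(text)

# span(n) gives the (start, end) indices covered by group n.
same_span = m_single.span(0) == m_single.span(1)
wider_spans = (m_wider.span(0), m_wider.span(1))

# Pattern.groups is the number of capturing groups in the pattern.
group_count = single.groups

print(same_span, wider_spans, group_count)
```

When the group wraps the entire pattern, both spans coincide, which is why both calls print 9 in the question.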
