How do I match a pattern repeated an arbitrary number of times with a regular expression?

Posted 2024-04-19 10:03:08


I have a regular expression defined as follows:

>>> import re
>>> regex = re.compile(r"(\d+:)+(\d+)")
>>> search_results = regex.search("52345:54325432:555:443:3:33")
>>> search_results.groups()
('3:', '33')

I know I can do this:

>>> "52345:54325432:555:443:3:33".split(":")

I would like to know how to get the same result with a regex.


3 Answers

Use re.findall if you want all of the matches; re.search stops at the first match:

>>> strs = "52345:54325432:555:443:3:33"
>>> re.findall(r"(\d+):(\d+)",strs)
[('52345', '54325432'), ('555', '443'), ('3', '33')]
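
The difference is easy to see side by side. The following is only an illustrative sketch (the variable names are made up for this example): a repeated capturing group keeps just its last repetition, and the pairwise pattern above silently leaves out a field when the number of fields is odd.

```python
import re

strs = "52345:54325432:555:443:3:33"

# A repeated capturing group only remembers its final repetition,
# which is why the pattern in the question yields just two groups.
last_only = re.search(r"(\d+:)+(\d+)", strs).groups()

# findall scans the whole string and returns every non-overlapping match.
pairs = re.findall(r"(\d+):(\d+)", strs)

# Caveat: the pairwise pattern consumes two fields per match, so an
# odd number of fields leaves the last one unmatched.
odd = re.findall(r"(\d+):(\d+)", "1:2:3")

print(last_only)  # ('3:', '33')
print(pairs)      # [('52345', '54325432'), ('555', '443'), ('3', '33')]
print(odd)        # [('1', '2')]
```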

If you want exactly the same result as str.split, you can do either of the following:

>>> re.split(r":",strs)
['52345', '54325432', '555', '443', '3', '33']
>>> re.findall(r"[^:]+",strs)
['52345', '54325432', '555', '443', '3', '33']
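
One caveat, shown as a small sketch with a made-up input: the two calls agree only when no field is empty. re.split keeps empty fields, just as str.split does, while the [^:]+ pattern skips them.

```python
import re

s = "1::2:"  # hypothetical input with an empty field and a trailing colon

by_str_split = s.split(":")
by_re_split = re.split(r":", s)
by_findall = re.findall(r"[^:]+", s)

print(by_str_split)  # ['1', '', '2', '']
print(by_re_split)   # ['1', '', '2', '']
print(by_findall)    # ['1', '2']
```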

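You should use split for this problem.

findall will process any valid string. Unfortunately, it will also process any invalid string. If that is what you want, fine; but you probably want to know when the input is malformed.

Example: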

>>> import re
>>> digits = re.compile(r"\d+")
>>> digits.findall("52345:54325432:555:443:3:33")
['52345', '54325432', '555', '443', '3', '33']
>>> digits.findall("52345:54325.432:555:443:3:33")
['52345', '54325', '432', '555', '443', '3', '33']
>>> digits.findall("There are 2 numbers and 53 characters in this string.")
['2', '53']

Of course, if you are determined to use only the re module, you can validate the string first and then extract the fields:

>>> valid = re.compile(r"(?:\d+:)*\d+$")
>>> digits = re.compile(r"\d+")
>>> s = "52345:54325432:555:443:3:33"
>>> digits.findall(s) if valid.match(s) else []
['52345', '54325432', '555', '443', '3', '33']
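
The validate-then-extract idea can also be wrapped in a function. This is a minimal sketch rather than the answerer's code: the name parse_fields is hypothetical, and it assumes Python 3.4+ so that re.fullmatch can be used instead of match with a trailing $ (which would still accept a string ending in a newline).

```python
import re

VALID = re.compile(r"(?:\d+:)*\d+")  # colon-separated runs of digits
DIGITS = re.compile(r"\d+")


def parse_fields(s):
    """Return the digit fields of s, or raise ValueError if s is malformed."""
    # fullmatch requires the whole string to match the pattern.
    if VALID.fullmatch(s) is None:
        raise ValueError(f"not a colon-separated list of digits: {s!r}")
    return DIGITS.findall(s)


print(parse_fields("52345:54325432:555:443:3:33"))
# ['52345', '54325432', '555', '443', '3', '33']

try:
    parse_fields("52345:54325.432:555:443:3:33")
except ValueError as exc:
    print("rejected:", exc)
```

Raising an exception instead of returning an empty list makes malformed input hard to overlook, which is the point this answer is making.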

Compare that with:

>>> [int(n) for n in "52345:54325432:555:443:3:33".split(":")]
[52345, 54325432, 555, 443, 3, 33]
>>> [int(n) for n in "52345:54325.432:555:443:3:33".split(":")]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '54325.432'

>>> [int(n)
...  for n in "There are 2 numbers and 53 characters in this string.".split(":")]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10:
  'There are 2 numbers and 53 characters in this string.'

See if this helps...

>>> pat = r'(\d+(?=\:)|\d+$)'
>>> regexp = re.compile(pat)
>>> m = regexp.findall("52345:54325432:555:443:3:33")
>>> m
['52345', '54325432', '555', '443', '3', '33']
>>>
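
Note that, like a plain \d+ pattern, this lookahead version does no validation. A short sketch with a made-up malformed input suggests what happens: digits that are not followed by a colon or by the end of the string are simply dropped.

```python
import re

# \d+(?=:) matches digits followed by a colon without consuming the colon;
# \d+$ matches the final run of digits at the end of the string.
# (An unescaped ":" is equivalent to the "\:" used in the pattern above.)
regexp = re.compile(r"(\d+(?=:)|\d+$)")

good = regexp.findall("52345:54325432:555:443:3:33")
bad = regexp.findall("52345:54325.432:555")  # hypothetical malformed input

print(good)  # ['52345', '54325432', '555', '443', '3', '33']
print(bad)   # ['52345', '432', '555']
```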
