Regular expression to match an exact pattern in a string

Posted 2024-04-24 11:12:45


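If I have the string 'some numbers 66666666666666666667867866 and serial 151283917503423 and 8888888' and I want to find a 15-digit number (so only 151283917503423), how do I keep the pattern from matching part of a longer number? It also has to handle the case where the string is just '151283917503423', so I cannot assume there is a space on either side.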

import re

serial = re.compile('[0-9]{15}')
serial.findall('some numbers 66666666666666666667867866 and serial 151283917503423 and 8888888')

This returns the first 15 digits of the longer number as well as 151283917503423, but I only want the latter.


3 Answers

Use word boundaries:

serial = re.compile(r'\b[0-9]{15}\b')

\b Matches the empty string, but only at the beginning or end of a word. A word is defined as a sequence of alphanumeric or underscore characters, so the end of a word is indicated by whitespace or a non-alphanumeric, non-underscore character. Note that formally, \b is defined as the boundary between a \w and a \W character (or vice versa), or between \w and the beginning/end of the string, so the precise set of characters deemed to be alphanumeric depends on the values of the UNICODE and LOCALE flags. For example, r'\bfoo\b' matches 'foo', 'foo.', '(foo)', 'bar foo baz' but not 'foobar' or 'foo3'. Inside a character range, \b represents the backspace character, for compatibility with Python’s string literals.
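As a quick check of the documentation excerpt above, the following sketch runs its own r'\bfoo\b' examples through re.search. The names word, samples, results and backspace are made up for this illustration.

```python
import re

# 'foo' as a whole word, exactly as in the documentation excerpt.
word = re.compile(r'\bfoo\b')

samples = ['foo', 'foo.', '(foo)', 'bar foo baz', 'foobar', 'foo3']

# True where the pattern finds a whole-word 'foo', False otherwise.
results = {text: bool(word.search(text)) for text in samples}

for text, matched in results.items():
    print(f'{text!r}: {matched}')

# Inside a character class, \b means the backspace character (\x08),
# not a word boundary.
backspace = re.compile(r'[\b]')
print(backspace.search('a\x08b') is not None)
```

The last two samples fail because 'b' and '3' are word characters, so there is no boundary right after 'foo'. The same reasoning is what stops a 15-digit pattern from matching inside a longer run of digits.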

You need to use word boundaries to make sure no unwanted text is matched on either side of the match:

>>> serial = re.compile(r'\b\d{15}\b')
>>> serial.findall('some numbers 66666666666666666667867866 and serial 151283917503423 and 8888888')
['151283917503423']
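The question also asks about a string that is nothing but the number. Here is a minimal sketch of that case and a few neighbours, assuming the same \b\d{15}\b pattern; find_serials and SERIAL are hypothetical names introduced only for this example.

```python
import re

# Same pattern as in the answer above, compiled once for reuse.
SERIAL = re.compile(r'\b\d{15}\b')


def find_serials(text):
    """Return every standalone run of exactly 15 digits in text."""
    return SERIAL.findall(text)


# The whole string is the number: \b also matches at the start and end
# of the string, so no surrounding spaces are needed.
print(find_serials('151283917503423'))

# 16 digits: no word boundary after the 15th digit, so nothing matches.
print(find_serials('1512839175034231'))

# Punctuation counts as a non-word character, so this still matches.
print(find_serials('serial: 151283917503423.'))
```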

Include word boundaries. Let s be your string. You can use

>>> re.findall(r'\b\d{15}\b', s)
['151283917503423']

where \b asserts a word boundary (^\w|\w$|\W\w|\w\W)
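One caveat that none of the answers raises: because \b is defined in terms of \w, a serial glued to letters or an underscore (for example 'SN151283917503423') has no boundary in front of it and is skipped. If such input is possible, digit-only lookarounds are an alternative. This is a sketch of a different technique, not what the answers propose, and the sample text is invented for illustration.

```python
import re

# Word-boundary version from the answers.
word_bounded = re.compile(r'\b\d{15}\b')

# Alternative: only require that the neighbours are not digits.
# (?<!\d) = not preceded by a digit, (?!\d) = not followed by a digit.
digit_bounded = re.compile(r'(?<!\d)\d{15}(?!\d)')

# Invented sample: two serials attached to word characters, one 16-digit number.
text = 'SN151283917503423 and id_151283917503423 and 1512839175034239'

print(word_bounded.findall(text))
print(digit_bounded.findall(text))
```

With this input the word-boundary pattern finds nothing, while the lookaround pattern finds both 15-digit serials and still rejects the 16-digit number. Which behaviour is right depends on whether a serial attached to letters should count as a match.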
