Python regular expression for repeated strings

User

Reply 1 · edited 2024-05-13 18:42:40

You can use the standard string methods, which are usually more readable.

s = "start: c12354, c3456, 34526;"
s.startswith("start:")  # True if s starts with this string
s.endswith(";")  # True if s ends with this string
s[6:-1].strip().split(", ")  # strip the leading space, then split on ", ": ['c12354', 'c3456', '34526']
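The three calls above can be combined into one small helper. This is only a sketch: the function name `parse_codes` and its `prefix`/`suffix` parameters are made up for illustration, and it assumes the tokens are separated by commas with optional surrounding whitespace.

```python
def parse_codes(line, prefix="start:", suffix=";"):
    """Return the tokens between prefix and suffix, or None if the line does not match."""
    line = line.strip()
    if not (line.startswith(prefix) and line.endswith(suffix)):
        return None
    # Cut off the prefix and suffix, then split the remainder on commas.
    body = line[len(prefix):-len(suffix)]
    return [token.strip() for token in body.split(",") if token.strip()]


print(parse_codes("start: c12354, c3456, 34526;"))  # ['c12354', 'c3456', '34526']
```

Stripping each token separately means the result does not depend on the exact spacing after each comma.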

User

Reply 2 · edited 2024-05-13 18:42:40

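In Python, this is not possible with a single regular expression: each new capture of a group overwrites the previous capture of that same group. (In .NET it would actually be possible, because that engine distinguishes between captures and groups.)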
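A quick way to see the overwriting is to repeat a capturing group and inspect what survives. The pattern below is an illustrative assumption, not taken from the question:

```python
import re

s = "start: c12354, c3456, 34526;"

# The capturing group (c?\d+) sits inside a repeated non-capturing group,
# so it matches three times, but only the final match is kept.
m = re.match(r"start:(?:\s*(c?\d+),?)+;", s)
print(m.group(1))  # 34526
```

The whole string matches, yet `m.group(1)` holds only the last code.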

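The simplest solution is to first extract the part between start: and ;, and then apply a regular expression that returns all matches rather than a single one, for example with re.findall.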
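A minimal sketch of that two-step approach, assuming `re.search` for the outer extraction and `re.findall` with the pattern `c?\d+` for the codes (the helper name `extract_codes` and both patterns are illustrative choices):

```python
import re


def extract_codes(line):
    """Return every code found between 'start:' and the first ';'."""
    # Step 1: isolate the text between the two delimiters.
    outer = re.search(r"start:(.*?);", line)
    if outer is None:
        return []
    # Step 2: collect every match inside that text, not just one.
    return re.findall(r"c?\d+", outer.group(1))


print(extract_codes("start: c12354, c3456, 34526;"))  # ['c12354', 'c3456', '34526']
```

Because the inner pattern contains no capturing group, `re.findall` returns the full text of each match.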

User

Reply 3 · edited 2024-05-13 18:42:40

This can be done (rather elegantly) with a tool like Pyparsing:

from pyparsing import Group, Literal, OneOrMore, Optional, ParseException, Word
import string

# One code: an optional leading "c", the digits, and an optional trailing comma
code = Group(Optional(Literal("c"), default='') + Word(string.digits) + Optional(Literal(","), default=''))
parser = Literal("start:") + OneOrMore(code) + Literal(";")
# Read lines from file:
with open('lines.txt', 'r') as f:
    for line in f:
        try:
            result = parser.parseString(line)
            # Each group is [prefix, digits, comma]; keep only the digits
            codes = [c[1] for c in result[1:-1]]
            # Do something with the codes...
        except ParseException:
            # The line doesn't match the grammar; skip it
            continue

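This is cleaner than a regular expression, returns the codes as a list (no string.split needed), and ignores any extra characters in the line, just as in your example.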
