Cleaning up Python regular expressions

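User

Reply #1 · edited 2024-05-14 00:25:44

You can write more readable regular expressions by using verbose mode. In this mode:

Whitespace within the pattern is ignored, except when it is in a character class or preceded by an unescaped backslash.
When a line contains a "#" that is neither in a character class nor preceded by an unescaped backslash, all characters from the leftmost such "#" through the end of the line are ignored.

The following two definitions are equivalent: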

a = re.compile(r"""\d +  # the integral part
                   \.    # the decimal point
                   \d *  # some fractional digits""", re.X)

b = re.compile(r"\d+\.\d*")

(Quoted from the verbose mode documentation.)
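A minimal sketch of the two rules above, for anyone who wants to check them directly. The sample strings and the `tag` pattern are made up for illustration; only `re.compile`, `re.VERBOSE`, `fullmatch` and `search` come from the standard library.

```python
import re

# Verbose pattern: whitespace and "#" comments are ignored by the regex engine.
verbose = re.compile(r"""
    \d+      # the integral part
    \.       # the decimal point
    \d*      # some fractional digits
""", re.VERBOSE)

# The same pattern written compactly.
compact = re.compile(r"\d+\.\d*")

samples = ["3.14", "42.", "abc", "7"]
verbose_results = [bool(verbose.fullmatch(s)) for s in samples]
compact_results = [bool(compact.fullmatch(s)) for s in samples]

# To match a literal space or '#' in verbose mode, put it in a
# character class or escape it with a backslash.
tag = re.compile(r"""
    issue [ ]   # a literal space, kept because it sits in a character class
    \# \d+      # a literal hash sign, kept because it is escaped
""", re.VERBOSE)
tag_match = tag.search("see issue #123")
```

Both result lists should agree for every sample, and `tag_match.group()` should give back the text including the space and the hash sign.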

User

Reply #2 · edited 2024-05-14 00:25:44

You can use comments inside a regex, which makes it much more readable. Here is an example from http://gnosis.cx/publish/programming/regular_expressions.html:

/               # identify URLs within a text file
          [^="] # do not match URLs in IMG tags like:
                # <img src="http://mysite.com/mypic.png">
http|ftp|gopher # make sure we find a resource type
          :\/\/ # ...needs to be followed by colon-slash-slash
      [^ \n\r]+ # stuff other than space, newline, tab is in URL
    (?=[\s\.,]) # assert: followed by whitespace/period/comma 
/
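The example above is written in Perl-style `/.../` delimiters. A rough Python port using `re.VERBOSE` might look like the sketch below. Two things are my own assumptions rather than part of the quoted pattern: the alternation is wrapped in `(?:...)` so that it only covers the scheme, and a capturing group is added so the URL can be pulled out without the preceding character. The sample text is made up.

```python
import re

url_pattern = re.compile(r"""
    [^="]                    # previous character is not = or a double quote,
                             # which skips URLs inside IMG tag attributes
    (                        # capture the URL itself (added for this sketch)
        (?:http|ftp|gopher)  # a resource type, grouped to keep the alternation local
        ://                  # followed by colon-slash-slash
        [^ \n\r]+            # anything other than space, newline, carriage return
    )
    (?=[\s.,])               # assert: followed by whitespace, period, or comma
""", re.VERBOSE)

text = (
    'Docs live at http://example.com/docs and '
    '<img src="http://mysite.com/mypic.png"> is skipped.'
)
found = url_pattern.findall(text)
```

Because `[^ \n\r]+` is greedy, a URL immediately followed by a comma or period will include that punctuation in the match, so treat this as a readability demo and not as a robust URL matcher.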

User

Reply #3 · edited 2024-05-14 00:25:44

While @Ayman's suggestion of re.VERBOSE is the better idea, if all you want is the layout you showed, just do:

patterns = re.compile(
        r'<! ([^->]|(-+[^->])|(-?>))*-{2,}>'
        r'\n+|\s{2}'
)

and Python's automatic concatenation of adjacent string literals (much like C's, btw) will do the rest ;-).
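A small sketch of the same trick on a simpler pattern, to show that the pieces really do end up as one string. The decimal-number pattern is borrowed from the first reply; the variable names are only illustrative.

```python
import re

# Adjacent string literals are joined into a single string before
# re.compile ever sees them, so each fragment can carry an ordinary
# Python comment on its own line.
decimal_source = (
    r'\d+'   # the integral part
    r'\.'    # the decimal point
    r'\d*'   # some fractional digits
)
decimal = re.compile(decimal_source)

is_same_pattern = decimal_source == r'\d+\.\d*'
matches_decimal = decimal.fullmatch("3.14") is not None
```

Unlike re.VERBOSE, nothing is stripped here: any whitespace inside a fragment is still part of the pattern, and only the Python-level comments outside the quotes are ignored.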
