Regex: marking a pattern

Posted 2024-06-08 15:24:24


I am trying to mark a sentence that contains "manu", from the nearest \n\n before it to the nearest \n\n after it. This is the text:

\n\nHolds Certificate No: EMS 96453\nand operates an Environmental Management System which complies with the requirements of ISO for\n\nthe following scope:The Environmental Management System of Dow Corning, for management of environmental\nrisks associated with all global business processes for the marketing, developing,\n   manufacturing, and supply of silicon-based and complementary products and services.\n\n/ tou\n\nFor and on behalf\n\n

I only want to capture this:

the following scope:The Environmental Management System of Dow Corning, for management of environmental\nrisks associated with all global business processes for the marketing, developing,\n   manufacturing, and supply of silicon-based and complementary products and services.

I tried this regex:

\\n\\n(.+manu.+?)\\n\\n

But it ignores the \n\n closest to my pattern and marks far more text than I want:

Holds Certificate No: EMS 96453\nand operates an Environmental Management System which complies with the requirements of ISO for\n\nthe following scope:The Environmental Management System of Dow Corning, for management of environmental\nrisks associated with all global business processes for the marketing, developing,\n   manufacturing, and supply of silicon-based and complementary products and services.

What am I missing?


1 answer
User
#1 · Posted 2024-06-08 15:24:24

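The regex engine works from left to right: it first matches the leftmost \\n\\n, and then .+ takes over. The dot matches any character, including the characters of a later \\n\\n, so here the match runs on until it reaches manu, no matter what lies in between.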
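The over-match described above can be reproduced with a minimal Python sketch. The `text` value is a shortened, hypothetical stand-in for the question's input, and it assumes (as the `\\n\\n` in the pattern suggests) that the input contains a literal backslash followed by `n` rather than real newlines.

```python
import re

# Shortened stand-in for the question's text (an assumption, not the full input).
# Raw string: each "\n" below is a literal backslash plus "n", not a newline.
text = (
    r"\n\nHolds Certificate\nand operates\n\n"
    r"the following scope:\n   manufacturing, and supply.\n\n/ tou\n\n"
)

# The pattern from the question: ".+" is free to run across later "\n\n" pairs.
naive = re.compile(r"\\n\\n(.+manu.+?)\\n\\n")

match = naive.search(text)
captured = match.group(1)

# The capture starts at the leftmost "\n\n", not at the one nearest to "manu".
print(captured)
```

The capture begins with "Holds Certificate" and still contains a `\n\n` pair, which is the unwanted extra text.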

Instead, you can match \\n\\n and make sure it does not occur again before manu is reached.

Then keep matching until the first \\n\\n that follows, and capture the part you want in a capture group:

\\n\\n((?:(?!\\n\\n).)+manu.+?)\\n\\n

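Explanation

  • \\n\\n Match literally
  • ( Capture group 1
    • (?:(?!\\n\\n).)+ Match any character one or more times, asserting at each position that what is directly to the right is not \\n\\n
    • manu.+? Match manu followed by as few characters as possible (at least one)
  • ) Close group 1
  • \\n\\n Match literally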
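As a rough illustration of the breakdown above, the following Python sketch applies the pattern to a shortened, hypothetical version of the question's text. It assumes the input holds a literal backslash followed by `n` (which is what `\\n\\n` matches), so the sample is written as a raw string.

```python
import re

# Shortened stand-in for the question's text (an assumption, not the full input).
# Raw string: each "\n" below is a literal backslash plus "n", not a newline.
text = (
    r"\n\nHolds Certificate\nand operates\n\n"
    r"the following scope:\n   manufacturing, and supply.\n\n/ tou\n\n"
)

# (?:(?!\\n\\n).)+ consumes one character at a time, but only while the
# upcoming characters are not "\n\n", so the match cannot cross a separator
# before it reaches "manu".
tempered = re.compile(r"\\n\\n((?:(?!\\n\\n).)+manu.+?)\\n\\n")

match = tempered.search(text)
captured = match.group(1)
print(captured)
```

With this sample the capture starts at "the following scope:" and ends at "supply.", because the attempt that starts at the first `\n\n` fails and the engine retries from the next one.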


If you also want a match when the part is followed by either \\n\\n or the end of the string:

\\n\\n((?:(?!\\n\\n).)+manu.+?)(?:\\n\\n|$)
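A small Python sketch of the end-of-string case, using a hypothetical sample in which the "manu" paragraph is the last thing in the text and has no closing `\n\n`. As before, the sample assumes literal backslash-`n` sequences and is written as a raw string.

```python
import re

# Hypothetical sample: the paragraph containing "manu" ends the string.
# Raw string: each "\n" below is a literal backslash plus "n", not a newline.
text_at_end = (
    r"\n\nFor and on behalf\n\n"
    r"the following scope:\n   manufacturing, and supply."
)

# Strict version: requires a closing "\n\n" after the captured part.
strict = re.compile(r"\\n\\n((?:(?!\\n\\n).)+manu.+?)\\n\\n")

# Relaxed version: accepts either a closing "\n\n" or the end of the string.
relaxed = re.compile(r"\\n\\n((?:(?!\\n\\n).)+manu.+?)(?:\\n\\n|$)")

strict_match = strict.search(text_at_end)
relaxed_match = relaxed.search(text_at_end)

print(strict_match)            # no closing separator, so nothing is found
print(relaxed_match.group(1))  # the final paragraph
```

The lazy `.+?` still stops at the first `\n\n` when one exists; the `$` alternative only comes into play when no separator follows.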

