Regex to capture part of a string

2 Answers

User

Answer 1 · edited 2024-04-19 23:20:12

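You get that result because you use header.string, which reads the .string attribute of the Match object. That attribute returns the whole string that was passed to match() or search(), not just the matched part.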
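To make the difference concrete, here is a minimal sketch comparing the two attributes. It reuses the text and pattern from this answer; the variable names `whole_input` and `matched` are only illustrative.

```python
import re

# Raw string: each "\n" here is a backslash followed by "n", not a line break.
text = r"# Title\n## Chapter\n### sub-chapter#### What a lovely day.\n"
header = re.search(r"(# .+?\\n)", text)

whole_input = header.string   # the full string that was passed to re.search()
matched = header.group()      # only the part the pattern actually matched

print(whole_input == text)    # the input comes back unchanged
print(matched)
```

In other words, .string is useful for recovering the input, while .group() (or .group(1) for the first capture group) is what extracts the match.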

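The string already contains the \n sequences (because it is a raw string, each one is a literal backslash followed by n, not an actual line break):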

text = r"# Title\n## Chapter\n### sub-chapter#### What a lovely day.\n"

So, if you keep your pattern (note that it also matches the trailing \n), you could update your code to:

import re

pattern = r"(# .+?\\n)"
text = r"# Title\n## Chapter\n### sub-chapter#### What a lovely day.\n"
header = re.search(pattern, text)
print(header.group())
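The pattern above uses \\n because the text is a raw string. As a rough illustration of the distinction, the sketch below compares a raw literal with a regular one; the shortened sample strings and variable names are assumptions made for the example.

```python
import re

raw_text = r"# Title\n## Chapter"    # raw literal: backslash + "n", no line break
plain_text = "# Title\n## Chapter"   # regular literal: one real newline character

# The raw version is one character longer (two characters instead of one).
print(len(raw_text) - len(plain_text))

# r"\\n" in a pattern matches a literal backslash followed by "n".
literal_hit = re.search(r"# .+?\\n", raw_text)
literal_miss = re.search(r"# .+?\\n", plain_text)

# r"\n" in a pattern matches a real newline character.
newline_hit = re.search(r"# .+?\n", plain_text)

print(literal_hit.group())
print(literal_miss)                  # no backslash in plain_text, so no match
print(repr(newline_hit.group()))
```

Which form is needed depends on how the text was produced: text read from a file normally holds real newlines, so the \n form of the pattern would apply there.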


Note that re.search scans the string and returns only the first location where the regular expression produces a match.
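A small sketch of that behaviour, contrasted with re.findall, which collects every match. The sample text and the #+ .* pattern are simplified assumptions for illustration, not taken from the question.

```python
import re

text = "# Title\n## Chapter\n### sub-chapter\n"

# re.search stops at the first position where the pattern matches.
first = re.search(r"#+ .*", text)

# re.findall returns every non-overlapping match instead.
all_headers = re.findall(r"#+ .*", text)

print(first.group())
print(all_headers)
```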

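Another option to match your value is to anchor at the start of the string (or of each line when re.M is set), match # followed by a space, and then any characters except a newline up to the end of the string (or line):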

^# .*$

For example:

import re

pattern = r"^# .*$"
text = "# Title\n## Chapter\n### sub-chapter#### What a lovely day.\n"
header = re.search(pattern, text, re.M)
print(header.group())


If no further # may appear after the heading, you could also use a negated character class that matches any character except # or a newline:

^# [^#\n\r]+$
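A quick way to try the negated character class is sketched below. It assumes the same text with real newlines and the re.M flag used in the previous example; the second input string is made up to show a rejected line.

```python
import re

pattern = r"^# [^#\n\r]+$"
text = "# Title\n## Chapter\n### sub-chapter#### What a lovely day.\n"

# With re.M, ^ and $ match at the start and end of each line.
header = re.search(pattern, text, re.M)
print(header.group())

# A line that contains another "#" after the heading text is rejected.
rejected = re.search(pattern, "# Title # trailing", re.M)
print(rejected)
```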

User

Answer 2 · edited 2024-04-19 23:20:12

My guess is that we want to extract # Title\n, in which case your expression seems to work fine with a slight modification:

(# .+?\\n)(.+)

Test

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r"(# .+?\\n)(.+)"

test_str = "# Title\\n## Chapter\\n### sub-chapter#### The Bar\\nIt was a fall day.\\n"

subst = "\\1"

# Limit the number of replacements with the count argument
# (passed by keyword; passing it positionally is deprecated in recent Python versions)
result = re.sub(regex, subst, test_str, count=1)

if result:
    print(result)

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.
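
If only the first heading is needed, reading the capture group directly may be simpler than substituting. The following is a sketch under the same assumptions as the test above (the \n sequences in test_str are literal backslash + n); the names `match` and `via_sub` are illustrative.

```python
import re

regex = r"(# .+?\\n)(.+)"
test_str = "# Title\\n## Chapter\\n### sub-chapter#### The Bar\\nIt was a fall day.\\n"

# Group 1 holds the first heading including its literal "\n" suffix.
match = re.search(regex, test_str)
if match:
    print(match.group(1))

# For this input, the substitution approach produces the same text.
via_sub = re.sub(regex, r"\1", test_str, count=1)
print(via_sub == match.group(1))
```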
