Getting the substring between two repeated keywords in Python

Posted 2024-04-23 23:08:43


Given the string:

 string = 'Other unwanted text here and start here: This is the first sentence.\nIt is the second one.\nNow, this is the third one.\nThis is not I want.\n'

I want to extract the first three sentences:

This is the first sentence.\nIt is the second one.\nNow, this is the third one.

Obviously, the following regular expression does not work:

re.search('(?<=This)(.*?)(?=\n)', string)

What is the correct expression to extract the text between This and the third \n?

Thanks.


3 answers

(?s)(This.*?)(?=\nThis)

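(?s) makes . match newlines as well; the pattern then looks for a sequence that starts with This and is followed by \nThis.

Don't forget that the __repr__ of the search result does not print the whole matched string, so you need: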

print(re.search('(?s)(This.*?)(?=\nThis)', string)[0])
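As a minimal sketch of the difference, the snippet below runs the pattern from the question next to the (?s) version. It assumes the string from the question; the names attempt and fixed are only illustrative, and re.DOTALL is used as the flag form of the inline (?s).

```python
import re

string = ('Other unwanted text here and start here: This is the first sentence.\n'
          'It is the second one.\nNow, this is the third one.\nThis is not I want.\n')

# Pattern from the question: "." stops at a newline and the lookbehind
# excludes "This" itself, so only the rest of the first line is matched.
attempt = re.search(r'(?<=This)(.*?)(?=\n)', string)

# re.DOTALL is the flag form of (?s): "." now matches "\n" too, and the lazy
# ".*?" stops at the first position that is followed by "\nThis".
fixed = re.search(r'This.*?(?=\nThis)', string, flags=re.DOTALL)

print(repr(attempt.group()))  # ' is the first sentence.'
print(fixed.group())          # the three wanted sentences
```

Indexing the match with [0], as above, and calling .group() return the same matched text.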

Jerry is right: regex is not the right tool here, and there are simpler, more efficient ways to solve this problem.

this = 'This is the first sentence.\nIt is the second one.\nNow, this is the third one.\nThis is not I want.\n'

print('\n'.join(this.split('\n', 3)[:-1]))

Output:

This is the first sentence.
It is the second one.
Now, this is the third one.

If you just want to practice using regex, following a tutorial would be much easier.
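The split example above starts from a string that already begins with This. A rough sketch of the same idea applied to the full string from the question could look like the following; first_lines_from is a hypothetical helper name, not something from the answer, and it uses str.find to skip the unwanted prefix first.

```python
def first_lines_from(text, keyword, count):
    """Return `count` lines of `text`, starting at the first `keyword`.

    Hypothetical helper for illustration; returns None if `keyword`
    does not occur in `text`.
    """
    start = text.find(keyword)
    if start == -1:
        return None
    # maxsplit=count leaves everything after the count-th newline in one
    # trailing piece; slicing with [:count] drops that piece.
    pieces = text[start:].split('\n', count)
    return '\n'.join(pieces[:count])


string = ('Other unwanted text here and start here: This is the first sentence.\n'
          'It is the second one.\nNow, this is the third one.\nThis is not I want.\n')

result = first_lines_from(string, 'This', 3)
print(result)
```

Slicing with [:count] rather than [:-1] is a small design choice: if the text has fewer than count newlines, [:-1] would drop a wanted line, while [:count] keeps whatever is there.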

You can use this regex to capture the three sentences starting with the text This:

This(?:[^\n]*\n){3}


Edit:

Python code:

import re

s = 'Other unwanted text here and start here: This is the first sentence.\nIt is the second one.\nNow, this is the third one.\nThis is not I want.\n'

# "This", then three runs of non-newline characters, each ending in "\n"
m = re.search(r'This(?:[^\n]*\n){3}', s)
if m:
    print(m.group())

Prints:

This is the first sentence.
It is the second one.
Now, this is the third one.
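One detail worth noting: the pattern above also consumes the newline after the third sentence, so the match ends in \n. If that is not wanted, a possible variant (my own assumption, not part of the answer) puts the newline in front of each following line instead. A minimal sketch comparing the two:

```python
import re

s = ('Other unwanted text here and start here: This is the first sentence.\n'
     'It is the second one.\nNow, this is the third one.\nThis is not I want.\n')

# Pattern from the answer: each of the three lines includes its trailing "\n".
with_newline = re.search(r'This(?:[^\n]*\n){3}', s).group()

# Variant: the first line, then two more lines each introduced by "\n",
# so the match stops right after the third sentence.
without_newline = re.search(r'This[^\n]*(?:\n[^\n]*){2}', s).group()

print(repr(with_newline[-1]))  # '\n'
print(without_newline)
```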
