Python regular expression search

User

Reply #1 · edited 2024-05-29 02:35:08

import re
# Raw string; the ur"..." prefix is Python 2 only and a syntax error in Python 3.
regex = r"\[P\] (.+?) \[/P\]"
line = "President [P] Barack Obama [/P] met Microsoft founder [P] Bill Gates [/P], yesterday."
person = re.findall(regex, line)
print(person)

yields

['Barack Obama', 'Bill Gates']

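Your regex ur"[\u005B1P\u005D.+?\u005B\u002FP\u005D]+?" is exactly the same as the unicode string u'[[1P].+?[/P]]+?', only harder to read.

The first bracketed group, [[1P], tells re that any one of the characters in the list ['[', '1', 'P'] should match, and likewise the second bracketed group, [/P], matches either '/' or 'P' (the ] after it is just a literal bracket). That is not what you want at all. So:

Remove the outer enclosing square brackets. (Also remove the stray 1 in front of P.)
To protect the literal brackets in [P], escape them with backslashes: \[P\].
To return only the words inside the tags, put grouping parentheses around .+?.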
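
The three steps above can be sketched side by side. This is only an illustration: the variable names are made up for the example, and the original pattern is written with an escaped \[ inside the character class, which is the same class but avoids Python 3's nested-set FutureWarning.

```python
import re

line = "President [P] Barack Obama [/P] met Microsoft founder [P] Bill Gates [/P], yesterday."

# Original pattern: the outer [...] turn the tag text into character classes,
# so "[[1P]" means "one of '[', '1', 'P'" rather than the literal tag "[P]".
# (Written as [\[1P] here; it is the same character class.)
broken = re.compile(r"[\[1P].+?[/P]]+?")

# Corrected pattern: literal brackets are escaped, and the parentheses
# make findall() return only the text between the tags.
fixed = re.compile(r"\[P\] (.+?) \[/P\]")

broken_matches = broken.findall(line)
fixed_matches = fixed.findall(line)

print(broken_matches)  # fragments that include tag characters, not the names
print(fixed_matches)
```

Printing both lists makes it easy to see that only the corrected pattern returns the bare names.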

User

Reply #2 · edited 2024-05-29 02:35:08

Your question isn't 100% clear, but I assume you want to find every piece of text inside the [P] ... [/P] tags:

>>> import re
>>> line = "President [P] Barack Obama [/P] met Microsoft founder [P] Bill Gates [/P], yesterday."
>>> re.findall(r'\[P\]\s?(.+?)\s?\[/P\]', line)
['Barack Obama', 'Bill Gates']
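
If the padding inside the tags is not always a single space, a slightly more tolerant variant might look like the sketch below. The helper name extract_tagged and the sample text are assumptions made for this example, not part of the question; it swaps \s? for \s* and adds re.DOTALL so that a name may span a line break.

```python
import re

# \s* absorbs any amount of whitespace next to the tags;
# re.DOTALL lets "." match newlines inside a tag pair.
TAG_RE = re.compile(r"\[P\]\s*(.+?)\s*\[/P\]", re.DOTALL)


def extract_tagged(text):
    """Return the text found between each [P] ... [/P] pair (hypothetical helper)."""
    return TAG_RE.findall(text)


sample = "[P]Ada Lovelace[/P] and [P]   Alan\nTuring  [/P]"
result = extract_tagged(sample)
print(result)
```

Compiling the pattern once is just a convenience when the same search runs over many lines.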

User

Reply #3 · edited 2024-05-29 02:35:08

Try this:

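import re
subject = "President [P] Barack Obama [/P] met Microsoft founder [P] Bill Gates [/P], yesterday."
for match in re.finditer(r"\[P[^\]]*\](.*?)\[/P\]", subject):
    # match.start() is where the match begins; match.end() is exclusive
    # match.group() is the whole match; match.group(1) is the text between the tags
    print(match.start(), match.end(), match.group(1).strip())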
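
As a rough illustration of why this pattern uses [^\]]* in the opening tag: it lets the tag carry extra text before the closing bracket. The attribute-style tag in the sample below is invented for the example and does not come from the question.

```python
import re

TAG = re.compile(r"\[P[^\]]*\](.*?)\[/P\]")

# The second opening tag carries a made-up attribute to show what [^\]]* allows.
subject = "[P] Barack Obama [/P] met [P role=founder] Bill Gates [/P]."

matches = list(TAG.finditer(subject))
names = [m.group(1).strip() for m in matches]   # text between the tags
spans = [m.span() for m in matches]             # (start, end) with end exclusive

for name, (start, end) in zip(names, spans):
    print(name, start, end, repr(subject[start:end]))
```

One caveat with this looser form: any opening tag that merely starts with P, such as [PRE], would also match, so the stricter \[P\] may be the safer choice when the tags never carry extra text.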
