Regex to match multiple words, Python

2 votes
2 answers
10279 views
Asked 2025-04-17 20:53

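I need to match a pattern in a string. The string varies, so the pattern needs some flexibility.
What I want is to extract the words associated with the word "layout", and these come in four different forms.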

1 word -- layout, e.g. hsr layout

2 words -- layout, e.g. golden garden layout

digit-word -- layout, e.g. 19th layout

digit-word word -- layout, e.g. 20th garden layout

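As you can see, the digit part needs to be optional, and a single regex should be able to handle all four forms. Here is the code I wrote: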

import re
p = re.compile(r'(?:\d*)?\w+\s(?:\d*)?\w+l[ayout]*')
text = "opp when the 19th hsr layut towards"
q = re.findall(p,text)

I need this expression to find "19th hsr layout", but the code above returns nothing. What is wrong with my code?

Here are some sample strings:

str1 = " 25/4 16th june road ,watertank layout ,blr"  #extract watertank layout 
str2 = " jacob circle 16th rusthumbagh layout , 5th cross" #extract 16th rusthumbagh layout
str3 = " oberoi splendor garden blossoms layout , 5th main road"  #extract garden blossoms layout
str4 = " belvedia heights , 15th layout near Jaffrey gym" #extract 15th layout

2 Answers

1

This seems to work:

import re

l = [" 25/4 16th june road ,watertank layout ,blr",
" jacob circle, 16th rusthumbagh layout , 5th cross",
" oberoi splendor , garden blossoms layout , 5th main road",
" belvedia heights , 15th layout near Jaffrey gym",]

for ll in l:
    # Capture the words between a comma and the word "layout"
    print(re.search(r',([\w\s]+)layout', ll).groups())

Output:

('watertank ',)
(' 16th rusthumbagh ',)
(' garden blossoms ',)
(' 15th ',)
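
The captured groups above keep their surrounding whitespace and drop the word "layout" itself. Here is a minimal sketch of one way to tidy that up, assuming you want the full phrase back. The names `extract_layout` and `samples` are hypothetical and not part of the answer:

```python
import re

# Same comma-separated sample strings as in the answer above.
samples = [
    " 25/4 16th june road ,watertank layout ,blr",
    " jacob circle, 16th rusthumbagh layout , 5th cross",
    " oberoi splendor , garden blossoms layout , 5th main road",
    " belvedia heights , 15th layout near Jaffrey gym",
]

def extract_layout(text):
    """Return the words between a comma and 'layout', plus 'layout' itself."""
    m = re.search(r',([\w\s]+)layout', text)
    if m is None:
        # No comma-prefixed phrase ending in "layout" was found.
        return None
    # Strip the whitespace captured around the words, then re-append "layout".
    return m.group(1).strip() + " layout"

results = [extract_layout(s) for s in samples]
print(results)
```

Note that this pattern relies on a comma appearing before the phrase, so it returns None for the question's original str2, which has no comma before "16th".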
2

Use r'(?:\w+\s+){1,2}layout', as I suggested in my comment:

>>> import re
>>> p = re.compile(r'(?:\w+\s+){1,2}layout')
>>> p.findall(" 25/4 16th june road ,watertank layout ,blr")
['watertank layout']
>>> p.findall(" jacob circle 16th rusthumbagh layout , 5th cross")
['16th rusthumbagh layout']
>>> p.findall(" oberoi splendor garden blossoms layout , 5th main road")
['garden blossoms layout']
>>> p.findall(" belvedia heights , 15th layout near Jaffrey gym")
['15th layout']

{1,2} repeats the group (?:\w+\s+) once or twice, so the pattern matches one or two words before "layout".
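
As for why the pattern in the question finds nothing: `\w+l` requires at least one word character before an "l" inside the second word, and "layout" starts with its only "l". In addition, `[ayout]*` is a character class, not the literal string "ayout". A small side-by-side sketch, assuming the sample text spells "layout" in full; the variable names `original` and `suggested` are just for illustration:

```python
import re

# Assumption: the sample text contains the full word "layout".
text = "opp when the 19th hsr layout towards"

# Pattern from the question: the second \w+ must consume at least one
# character before "l", which never happens for the word "layout".
original = re.compile(r'(?:\d*)?\w+\s(?:\d*)?\w+l[ayout]*')

# Pattern from this answer: one or two words, each followed by whitespace,
# then the literal word "layout".
suggested = re.compile(r'(?:\w+\s+){1,2}layout')

original_matches = original.findall(text)
suggested_matches = suggested.findall(text)

print(original_matches)
print(suggested_matches)
```

Because `\w` already covers digits, a word such as "19th" needs no separate optional digit group.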
