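Regular expression to match multiple words in Python
I need to match a pattern in a string. The string varies, so the pattern needs some flexibility.
What I want to extract is the words that accompany the word "layout", and they come in four different forms: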
1 word -- layout, e.g. hsr layout
2 words -- layout, e.g. golden garden layout
digit-word -- layout, e.g. 19th layout
digit-word word -- layout, e.g. 20th garden layout
As you can see, the digit part needs to be optional, and a single regex should cover all four forms. Here is the code I wrote:
import re
p = re.compile(r'(?:\d*)?\w+\s(?:\d*)?\w+l[ayout]*')
text = "opp when the 19th hsr layut towards"
q = re.findall(p,text)
I need to find "19th hsr layout" in this text, but the code above returns nothing. What is wrong with my code?
Here are some sample strings:
str1 = " 25/4 16th june road ,watertank layout ,blr" #extract watertank layout
str2 = " jacob circle 16th rusthumbagh layout , 5th cross" #extract 16th rusthumbagh layout
str3 = " oberoi splendor garden blossoms layout , 5th main road" #extract garden blossoms layout
str4 = " belvedia heights , 15th layout near Jaffrey gym" #extract 15th layout
2 Answers
1
This seems to work:
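import re

# Sample strings; the pattern below expects a comma before the target phrase.
l = [" 25/4 16th june road ,watertank layout ,blr",
     " jacob circle, 16th rusthumbagh layout , 5th cross",
     " oberoi splendor , garden blossoms layout , 5th main road",
     " belvedia heights , 15th layout near Jaffrey gym"]
for ll in l:
    # Capture the run of word/space characters between a comma and "layout".
    print(re.search(r',([\w\s]+)layout', ll).groups())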
Output:
('watertank ',)
(' 16th rusthumbagh ',)
(' garden blossoms ',)
(' 15th ',)
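The captured groups above keep their surrounding whitespace and leave out the word "layout" itself, and the pattern only finds a phrase that is preceded by a comma. A minimal sketch of tidying the result, assuming that comma requirement is acceptable; the helper name `phrase_before_layout` is hypothetical and not part of the answer:

```python
import re

def phrase_before_layout(text):
    """Return the comma-delimited phrase ending in 'layout', or None."""
    m = re.search(r',([\w\s]+)layout', text)
    if m is None:
        return None  # no comma followed by "... layout"
    # split() drops the leading/trailing whitespace kept by the capture group
    words = m.group(1).split()
    return " ".join(words + ["layout"])

print(phrase_before_layout(" belvedia heights , 15th layout near Jaffrey gym"))
print(phrase_before_layout(" jacob circle 16th rusthumbagh layout , 5th cross"))
```

The second call uses the string as written in the question, with no comma before "16th", and returns None; that is the case this pattern does not cover.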
2
Use r'(?:\w+\s+){1,2}layout'
As I said in my comment:
>>> import re
>>> p = re.compile(r'(?:\w+\s+){1,2}layout')
>>> p.findall(" 25/4 16th june road ,watertank layout ,blr")
['watertank layout']
>>> p.findall(" jacob circle 16th rusthumbagh layout , 5th cross")
['16th rusthumbagh layout']
>>> p.findall(" oberoi splendor garden blossoms layout , 5th main road")
['garden blossoms layout']
>>> p.findall(" belvedia heights , 15th layout near Jaffrey gym")
['15th layout']
{1,2} is there to match at most two words (one or two) before "layout".
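The same pattern can be wrapped in a small function and run over all four sample strings from the question. This is only a sketch; the names `LAYOUT_RE` and `find_layouts` are hypothetical:

```python
import re

# Pattern from the answer above: one or two words, then the literal "layout".
LAYOUT_RE = re.compile(r'(?:\w+\s+){1,2}layout')

def find_layouts(text):
    """Return every 'word [word] layout' phrase found in text."""
    return LAYOUT_RE.findall(text)

samples = [
    " 25/4 16th june road ,watertank layout ,blr",
    " jacob circle 16th rusthumbagh layout , 5th cross",
    " oberoi splendor garden blossoms layout , 5th main road",
    " belvedia heights , 15th layout near Jaffrey gym",
]
for s in samples:
    print(find_layouts(s))
```

Two words are returned whenever two words directly precede "layout"; a comma or other non-word character in between limits the match to one word, which is why the first string gives only "watertank layout". The pattern requires the literal spelling "layout", so the misspelt "layut" in the question's own test string would produce an empty list.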