I am trying to match the following cases, except for cases 6 and 8:
case 1 - deliverto should match
case 2 - deliveryto : should match
case 3 - deliveryto: should match
case 4 - delivery to : should match
case 5 - delivery address : should match
case 6 - delivery order : should NOT match
case 7 - ship to: should match
case 8 - delivery inst : should NOT match
case 9 - delivery should match
case 10 - remit to : should match
case 11 - send to: should match
case 12 - remitto: should match
case 13 - delivery: should match
case 14 - deliver: should match
case 15 - delv. : should match
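My logic is: match a first-block word [ship, send, remit, deliver, delivery, or delv. (the dot is optional)], whether or not a second-block word [to or address] follows it, but do not match the first-block word [ship or …] if a third-block word [order or inst] follows it.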
I used a negative lookahead for the third block and then an optional positive lookahead for the second block. This is the regex I have been trying:
pattern = r"(send|remit|ship|delivery|deliver|delv\.?)\s?(?!(Order|inst))(?=(to|address)?)\:?"
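The first problem I am facing: the regex matches even when the first block is followed by the third block.
The second problem: when the possible cases are in a list and I run re.finditer() over them, the optional second block is not matched: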
import re

l = ['case 1 - deliverto', 'case 2 - deliveryto :', 'case 3 - deliveryto: ', 'case 4 - delivery to :', 'case 5 - delivery address :', 'case 6 - delivery order :', 'case 7 - ship to:', 'case 8 - delivery inst :', 'case 9 - delivery ', 'case 10 - remit to :', 'case 11 - send to:', 'case 12 - remitto:', 'case 13 - delivery: ', 'case 14 - deliver: ', 'case 15 - delv. :']
for i in l:
    # Print every match found in the current case string
    print([m.group() for m in re.finditer(pattern, i, re.IGNORECASE)])
This gives:
['deliver']
['delivery']
['delivery']
['delivery ']
['delivery ']
['delivery']
['ship ']
['delivery']
['delivery ']
['remit ']
['send ']
['remit']
['delivery:']
['deliver:']
['delv. :']
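I need the optional to or address block to be part of the match when it is present. What am I doing wrong in my regex?
For implementation details, please see this regex101 demo. Thanks.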
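Both symptoms can be reproduced in isolation. The sketch below reuses the pattern from the question unchanged; the variable names `first` and `second` are only illustrative. It suggests that the optional `\s?` can give back the space, so the negative lookahead is tested in front of " order" rather than "order", and that a lookahead is zero-width, so `to` is checked but never consumed.

```python
import re

# The pattern from the question, unchanged.
pattern = r"(send|remit|ship|delivery|deliver|delv\.?)\s?(?!(Order|inst))(?=(to|address)?)\:?"

# Symptom 1: "delivery order" still matches. When \s? consumes the space,
# the negative lookahead sees "order" and fails; the engine then backtracks,
# lets \s? match nothing, and the lookahead now sees " order" and succeeds.
first = re.search(pattern, "delivery order :", re.IGNORECASE)
print(repr(first.group()))

# Symptom 2: "to" is never part of the match. (?=(to|address)?) only looks
# ahead; it consumes no characters, so the match ends before "to".
second = re.search(pattern, "delivery to :", re.IGNORECASE)
print(repr(second.group()))
```

The two printed values correspond to the `['delivery']` and `['delivery ']` lines in the output above for cases 6 and 4.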
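You need to make the match fail when the first word is followed by order or inst, which a negative lookahead placed before the first word can do.
See the regex demo.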
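Details:
(?i) - case-insensitive matching (same as re.I)
\b - a word boundary
(?!\S+\s+(?:order|inst)) - fails the match if one or more non-whitespace characters, then one or more whitespace characters, then order or inst appear immediately to the right
(?:send|remit|ship|delivery?|delv\.?) - send, remit, ship, deliver or delivery, delv or delv.
(?:\s*(?:to|address))? - an optional sequence of zero or more whitespace characters followed by to or address
\s? - an optional whitespace character
:? - an optional colon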
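Concatenating the components listed above, in the order given, yields a single pattern. The sketch below is a reconstruction from that list rather than a verbatim copy of the linked demo, and the names `cases` and `results` are illustrative. It runs the assembled pattern over the list from the question.

```python
import re

# Assumption: assembled by joining the listed components in order.
pattern = (
    r"(?i)\b(?!\S+\s+(?:order|inst))"          # flag, word boundary, reject "<word> order/inst"
    r"(?:send|remit|ship|delivery?|delv\.?)"   # first block
    r"(?:\s*(?:to|address))?"                  # optional second block, consumed this time
    r"\s?:?"                                   # optional whitespace and colon
)

cases = ['case 1 - deliverto', 'case 2 - deliveryto :', 'case 3 - deliveryto: ',
         'case 4 - delivery to :', 'case 5 - delivery address :',
         'case 6 - delivery order :', 'case 7 - ship to:', 'case 8 - delivery inst :',
         'case 9 - delivery ', 'case 10 - remit to :', 'case 11 - send to:',
         'case 12 - remitto:', 'case 13 - delivery: ', 'case 14 - deliver: ',
         'case 15 - delv. :']

# Collect the matches per case; cases 6 and 8 should come back empty.
results = [[m.group() for m in re.finditer(pattern, text)] for text in cases]
for text, found in zip(cases, results):
    print(text, '->', found)
```

Because the second block is now an ordinary optional group instead of a lookahead, `to` and `address` become part of the matched text, and the negative lookahead is evaluated once at the word boundary, before any optional whitespace can be backtracked.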