Regex that matches one word only when another phrase does not appear?

Posted 2024-05-19 01:39:31


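I'm usually good with regular expressions, but I'm struggling with this one. I need a regex that matches the term cbd, but not if the phrase central business district appears anywhere in the search string. Or, if that is too hard, at least match cbd only when the phrase central business district does not appear anywhere before the term cbd. The result should return only the cbd part, so I have been using lookaheads/lookbehinds, but I haven't been able to meet the requirement.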

Input examples:
Good: Any products containing CBD are to be regulated.
Bad: Properties located within the Central Business District (CBD) are to be regulated

I tried:

  • (?!central business district)cbd
  • (.*(?!central business district).*)cbd
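A quick check of why both attempts still match the bad input, as a minimal sketch using the standard re module. The re.IGNORECASE flag is an assumption, added because the sample text is capitalized. In the first pattern the lookahead is tested at the position where cbd starts, and the text there never begins with the phrase, so it always succeeds. In the second, the engine only has to find one position between the two .* where the phrase does not start, which it always can.

```python
import re

bad = "Properties located within the Central Business District (CBD) are to be regulated"

# The two attempted patterns, unchanged; IGNORECASE is an assumption (see above).
attempt_1 = re.compile(r"(?!central business district)cbd", re.IGNORECASE)
attempt_2 = re.compile(r"(.*(?!central business district).*)cbd", re.IGNORECASE)

m1 = attempt_1.search(bad)
m2 = attempt_2.search(bad)

# Both still match the sentence that should have been rejected.
print(m1.group())  # CBD
print(m2.group())  # everything from the start of the string through "CBD"
```

So a negative lookahead only constrains what follows the position where it is evaluated; it says nothing about the rest of the string unless it is combined with .* inside the lookahead or anchored at the start.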

This is implemented in Python 3.6+ using the re module.

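I know this could easily be done in a few lines of code, but we keep a list of regex strings in a database and use it to search a corpus for documents that match any of those regex strings. I'd prefer to avoid hard-coding any keywords into the script, because then our other developers would not know where these matches come from, since they would not see them in the database.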


1 answer
User
#1 · Posted 2024-05-19 01:39:31

Use the PyPI regex module, which, unlike the standard re module, supports variable-length lookbehind:

import regex  # third-party module from PyPI: pip install regex

strings = [
    'I need a regular expression that matches the term cbd but not if the phrase central business district appears anywhere else in the search string.',
    'I need cbd here.',
]
for s in strings:
    # regex.S lets . span newlines; regex.I also catches "CBD" and "Central Business District"
    x = regex.search(r'(?<!central business district.*)cbd(?!.*central business district)', s, regex.S | regex.I)
    if x:
        print(s, x.group(), sep=" => ")

Result: I need cbd here. => cbd

Explanation

  (?<!                     look behind to see if there is not:
    central business         'central business district'
    district
    .*                       any character (0 or more times, matching
                             the most amount possible); with regex.S
                             this includes \n
  )                        end of look-behind
  cbd                      'cbd'
  (?!                      look ahead to see if there is not:
    .*                       any character (0 or more times, matching
                             the most amount possible); with regex.S
                             this includes \n
    central business         'central business district'
    district
  )                        end of look-ahead
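
If installing a third-party module is not an option, something similar can be sketched with the standard re module, which only allows fixed-width lookbehind. This is an assumed alternative, not part of the answer above: the pattern rejects the whole string up front with a lookahead anchored at \A, then captures the term in group 1, so the caller has to read group(1) rather than the full match. The helper name find_term is hypothetical.

```python
import re

# Assumed pattern (not from the answer): the anchored negative lookahead
# fails if the phrase occurs anywhere; otherwise group 1 captures the term.
PATTERN = re.compile(
    r"\A(?!.*central business district).*?(cbd)",
    re.IGNORECASE | re.DOTALL,
)

def find_term(text):
    """Return the matched term, or None if the phrase appears anywhere."""
    m = PATTERN.search(text)
    return m.group(1) if m else None

good = "Any products containing CBD are to be regulated."
bad = "Properties located within the Central Business District (CBD) are to be regulated"

print(find_term(good))  # CBD
print(find_term(bad))   # None
```

The trade-off is that the full match spans everything from the start of the string to the term, so code that consumes the database patterns would need to agree on reading the first capture group.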
