Matching a group multiple times in a string

Posted 2024-04-29 04:49:02


I'm trying to use a regular expression. I have a string that I need to match:

 influences = 
 {{hlist |[[Plato]] |[[Aristotle]] |[[Socrates]] |[[David Hume]] |[[Adam Smith]] |[[Cicero]] |[[John Locke]]}}
 {{hlist |[[Saint Augustine]] |[[Saint Thomas Aquinas]] |[[Saint Thomas More]] |[[Richard Hooker]] |[[Edward Coke]]}}
 {{hlist |[[Thomas Hobbes]] |[[Rene Descartes]] |[[Montesquieu]] |[[Joshua Reynolds]] |[[Sir William Blackstone|William Blackstone]]}}
 {{hlist |[[Niccolo Machiavelli]] |[[Dante Alighieri]] |[[Samuel Johnson]] |[[Voltaire]] |[[Jean Jacques Rousseau]] |[[Jeremy Bentham]]}}

I want to extract the following templates from the text:

{{hlist .... }}

By contrast, the following text must not be matched:

main_interests = 
 {{hlist |[[Music]] |[[Art]] |[[Theatre]] |[[Literature]]}}

I wrote this regular expression, but it doesn't work:

(?:^\|\s*)?(?:influences)\s*?=\s*?(?:(?:\s*\{\{hlist)\s*\|([\d\w\s\-()*—&;\[\]|#%.<>·:/",\'!{}=•?’
á~ü°œéö$àèìòùÀÈÌÒÙáéíóúýÁÉÍÓÚÝâêîôûÂÊÎÔÛãñõÃÑÕäëïöüÿÄËÏÖÜŸçÇßØøÅåÆæœ]*?)(?=\n))+

I'm using Python.


1 answer

Answer 1 · posted 2024-04-29 04:49:02

You can combine a list comprehension with a couple of regular expressions:

import re
string = """
influences = 
 {{hlist |[[Plato]] |[[Aristotle]] |[[Socrates]] |[[David Hume]] |[[Adam Smith]] |[[Cicero]] |[[John Locke]]}}
 {{hlist |[[Saint Augustine]] |[[Saint Thomas Aquinas]] |[[Saint Thomas More]] |[[Richard Hooker]] |[[Edward Coke]]}}
 {{hlist |[[Thomas Hobbes]] |[[Rene Descartes]] |[[Montesquieu]] |[[Joshua Reynolds]] |[[Sir William Blackstone|William Blackstone]]}}
 {{hlist |[[Niccolo Machiavelli]] |[[Dante Alighieri]] |[[Samuel Johnson]] |[[Voltaire]] |[[Jean Jacques Rousseau]] |[[Jeremy Bentham]]}}
"""

matches = [template.group(1) 
           for match in re.findall(r'\{\{hlist.+?\}}', string)
           for template in re.finditer(r'\[\[([^]]+)\]\]', match)]
print(matches)
# ['Plato', 'Aristotle', 'Socrates', 'David Hume', 'Adam Smith', 'Cicero', 'John Locke', 'Saint Augustine', 'Saint Thomas Aquinas', 'Saint Thomas More', 'Richard Hooker', 'Edward Coke', 'Thomas Hobbes', 'Rene Descartes', 'Montesquieu', 'Joshua Reynolds', 'Sir William Blackstone|William Blackstone', 'Niccolo Machiavelli', 'Dante Alighieri', 'Samuel Johnson', 'Voltaire', 'Jean Jacques Rousseau', 'Jeremy Bentham']

It uses two expressions: one for the outer part ({{hlist...}}) and one for the inner part ([[...]]).
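As for why two passes help: in Python's `re` module, a capturing group inside a repetition keeps only its last match, so a single pattern that repeats the `[[...]]` group cannot return every name. Here is a minimal sketch of that behaviour, using a shortened sample line that is made up for illustration:

```python
import re

# Shortened sample line, for illustration only.
line = "{{hlist |[[Plato]] |[[Aristotle]] |[[Socrates]]}}"

# One pattern with a repeated capturing group:
# the group is overwritten on every repetition, so only the last name survives.
single = re.search(r'\{\{hlist(?:\s*\|\[\[([^\]]+)\]\])+\}\}', line)
last_only = single.group(1)
print(last_only)   # Socrates

# A separate inner pattern applied with findall returns every occurrence.
names = re.findall(r'\[\[([^\]]+)\]\]', line)
print(names)       # ['Plato', 'Aristotle', 'Socrates']
```

This is why the outer template and the inner links are easier to handle as two separate steps.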


a demo on regex101.com
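Note that the code in the answer collects links from every {{hlist ...}} template, so it would also pick up the main_interests block that the question wants to exclude, and it keeps the `|label` part of piped links. One possible way to handle both is sketched below. The helper name `field_links` and the shortened sample text are assumptions made for illustration, and the sketch assumes the templates follow the `field =` line directly, as in the question.

```python
import re

# Shortened sample text, for illustration only.
text = """
influences = 
 {{hlist |[[Plato]] |[[Aristotle]]}}
 {{hlist |[[Sir William Blackstone|William Blackstone]]}}
main_interests = 
 {{hlist |[[Music]] |[[Art]]}}
"""

def field_links(source, field):
    """Return the link targets from the {{hlist ...}} templates of one field.

    Hypothetical helper: it assumes the templates immediately follow
    the "field =" line and that each template sits on a single line.
    """
    # Step 1: capture the run of consecutive {{hlist ...}} templates
    # that comes right after "field =" at the start of a line.
    block = re.search(
        r'^\s*\|?\s*' + re.escape(field) + r'\s*=\s*((?:\{\{hlist.*?\}\}\s*)+)',
        source,
        flags=re.MULTILINE,
    )
    if block is None:
        return []
    # Step 2: take each link target, stopping at "]" or at the "|" of a piped link.
    return [m.group(1) for m in re.finditer(r'\[\[([^\]|]+)', block.group(1))]

print(field_links(text, 'influences'))
# ['Plato', 'Aristotle', 'Sir William Blackstone']
print(field_links(text, 'main_interests'))
# ['Music', 'Art']
```

Because the first step stops at the first line that is not an hlist template, links from the following field never leak into the result.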
