How to get all matched iterations of a capture group

Posted 2024-05-23 14:45:54


I wrote a regular expression and use it with re.findall():

SELECT.*{(?:\[([a-zA-Z0-9 ]*)\]\.\[([a-zA-Z0-9 ]*)\]\.\[([a-zA-Z0-9 ]*)\][,]{0,1}){1,}}.*

made with https://jex.im

to match these lists of strings:

["dimSales","Product Title","All"], ["test","Product Title","All"]

in this haystack:

SELECT NON EMPTY Hierarchize({DrilldownLevel({[dimSales].[Product Title].[All],[test].[Product Title].[All]},,,INCLUDE_CALC_MEMBERS)}) DIMENSION PROPERTIES PARENT_UNIQUE_NAME,HIERARCHY_UNIQUE_NAME ON COLUMNS FROM [Model] CELL PROPERTIES VALUE, FORMAT_STRING, LANGUAGE, BACK_COLOR, FORE_COLOR, FONT_FLAGS

My regex only matches the last iteration of the outer capture group:

["test","Product Title","All"]

What do I need to change so that re.findall() returns all iterations, not just the last iteration of the outer capture group?
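The behaviour can be reproduced with a much smaller pattern. A minimal sketch (the shortened pattern and the input string are made up for illustration, they are not the original query): when a capture group sits inside a repeated group, each iteration overwrites the previous capture, so only the last one is left when the match finishes.

```python
import re

# Hypothetical, shortened version of the pattern: one capture group
# inside a repeated non-capturing group.
pattern = r"\{(?:\[([a-zA-Z0-9 ]*)\],?)+\}"
text = "{[dimSales],[test]}"

# The whole "{...}" is consumed by ONE match, and group 1 is overwritten
# on every iteration of the repeated group.
result = re.findall(pattern, text)
print(result)  # → ['test']
```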


2 answers

How about this regex:

(\[\"[^\"]*\",\"[^\"]*\",\"[^\"]*\"\],\s*\[\"[^\"]*\",\"[^\"]*\",\"[^\"]*\"\])

Demo:

https://regex101.com/r/LaddaK/2/

Explanation:

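  • The parentheses () create a capture group; you can remove them if you do not need one.
  • \[\"[^\"]*\",\"[^\"]*\",\"[^\"]*\"\] matches a literal opening bracket followed by a double quote, 0 to N characters that are not a double quote ([^\"]*), then a double quote and a comma. If you want to accept whitespace around the commas, you may have to surround every comma with \s*.
  • The pattern \"[^\"]*\" is repeated two more times to match the 3 words inside the brackets (depending on the exact constraints of your strings, you may want to tighten [^\"]* to \w*).
  • After ,\s* the whole block \[\"[^\"]*\",\"[^\"]*\",\"[^\"]*\"\] is repeated, so that the full pattern accepts two bracketed blocks.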
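The points above can be tried out in Python. This is only a sketch of a variation, not the regex from the answer itself: it matches one bracketed block at a time, wraps each quoted word in its own capture group, and allows optional whitespace after the commas. The variable names are made up for illustration.

```python
import re

# The quoted lists from the question.
text = '["dimSales","Product Title","All"], ["test","Product Title","All"]'

# One bracketed block: three quoted words, optional whitespace after commas.
block = r'\["([^"]*)",\s*"([^"]*)",\s*"([^"]*)"\]'

# findall scans the string repeatedly, so every block yields its own tuple.
triples = re.findall(block, text)
print(triples)
# → [('dimSales', 'Product Title', 'All'), ('test', 'Product Title', 'All')]
```

Because the pattern describes a single block rather than a fixed pair, it also works if the input contains more than two blocks.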

Notes:

  • You may want to surround the regex with anchors (^ and $).

  • I don't know your exact constraints, but if you want to analyse JSON or parse any other format with arbitrarily nested, self-repeating patterns (e.g. fractals), you should not use regex.

Edit after the requirements changed:

import re

inputStr = '[dimSales,Product Title,All], [test,Product Title,All]'
print(re.findall(r'\[(?:[a-zA-Z0-9 ]*)(?:,[a-zA-Z0-9 ]*)*\]', inputStr))

Output:

['[dimSales,Product Title,All]', '[test,Product Title,All]']

import re

string = "SELECT NON EMPTY Hierarchize({DrilldownLevel({[dimSales].[Product Title].[All],[test].[Product Title].[All]},,,INCLUDE_CALC_MEMBERS)}) DIMENSION PROPERTIES PARENT_UNIQUE_NAME,HIERARCHY_UNIQUE_NAME ON COLUMNS FROM [Model] CELL PROPERTIES VALUE, FORMAT_STRING, LANGUAGE, BACK_COLOR, FORE_COLOR, FONT_FLAGS"

# Each match contributes one (group 1, group 2, group 3) tuple to the result.
print(re.findall(r"(?:SELECT .+\({|,)\[([\w ]+)\]\.\[([\w ]+)\]\.\[([\w ]+)\](?=[^}]*})", string))

Output:

[('dimSales', 'Product Title', 'All'), ('test', 'Product Title', 'All')]

Explanation:

(?:SELECT .+\({|,)      # non-capturing group: SELECT followed by 1 or more of any character and then ({, OR a comma
\[([\w ]+)\]            # group 1: 1 or more word characters or spaces inside square brackets
\.                      # a dot
\[([\w ]+)\]            # group 2: 1 or more word characters or spaces inside square brackets
\.                      # a dot
\[([\w ]+)\]            # group 3: 1 or more word characters or spaces inside square brackets
(?=[^}]*})              # positive lookahead: a closing curly bracket must follow, with no other closing curly bracket before it
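An alternative that avoids the alternation and the lookahead is to work in two steps. This is just a sketch under the assumption that the member list always sits in the innermost {...} block of the query; the names mdx, block and members are made up for illustration, and the query string is shortened.

```python
import re

mdx = ("SELECT NON EMPTY Hierarchize({DrilldownLevel({[dimSales].[Product Title].[All],"
       "[test].[Product Title].[All]},,,INCLUDE_CALC_MEMBERS)}) "
       "DIMENSION PROPERTIES PARENT_UNIQUE_NAME,HIERARCHY_UNIQUE_NAME ON COLUMNS FROM [Model]")

# Step 1: find the first {...} block that contains no nested braces.
block = re.search(r"\{([^{}]*)\}", mdx)

# Step 2: extract every [a].[b].[c] triple from inside that block only.
triple = r"\[([\w ]+)\]\.\[([\w ]+)\]\.\[([\w ]+)\]"
members = re.findall(triple, block.group(1)) if block else []
print(members)
# → [('dimSales', 'Product Title', 'All'), ('test', 'Product Title', 'All')]
```

Splitting the work this way keeps each pattern short: the first one only locates the block, and the second one is applied with re.findall(), which returns one tuple per triple instead of a single overwritten capture.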
