Regex to capture the text inside brackets, omitting an optional prefix

import re

s = 'including the [[Royal Danish Academy of Sciences and Letters|Danish Academy of Sciences]], [[Norwegian Academy of Science and Letters|Norwegian Academy of Sciences]], [[Russian Academy of Sciences]], and [[National Academy of Sciences|US National Academy of Sciences]].'

# Nested approach: two passes, inner call first
re.sub(r'\[\[(.*?\|)(.*?)\]\]', r'\2',        # case 1: [[prefix|text]]
       re.sub(r'\[\[([^|]+)\]\]', r'\1', s))  # case 2: [[text]]
# result is correct:
# 'including the Danish Academy of Sciences, Norwegian Academy of Sciences, Russian Academy of Sciences, and US National Academy of Sciences.'

# Single-pass attempt
re.sub(r'\[\[([^|]*\|)?(.*?)\]\]', r'\2', s)
# does NOT return the desired result:
# 'including the Danish Academy of Sciences, Norwegian Academy of Sciences, US National Academy of Sciences.'
# is missing: 'Russian Academy of Sciences, and '

1 answer

User

#1 · posted 2024-06-16 10:31:29

See regex in use here

\[{2}(?:(?:(?!]{2})[^|])+\|)*((?:(?!]{2})[^|])+)]{2}

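\[{2} matches [[
(?:(?:(?!]{2})[^|])+\|)* matches the following any number of times
- (?:(?!]{2})[^|])+ Tempered greedy token: matches any character one or more times, except | or a position where ]] matches
- \| matches | literally
((?:(?!]{2})[^|])+) captures the following into capture group 1
- (?:(?!]{2})[^|])+ Tempered greedy token: matches any character one or more times, except | or a position where ]] matches
]{2} matches ]]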
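To see what the (?!]{2}) guard in the tempered token buys, it can help to look at what the question's single-pass pattern actually captures. The following is only an illustrative sketch; the shortened sample string `fragment` and the variable names are made up for this purpose. Without the guard, [^|]* is free to run past the first ]] until it reaches the | of the next link.

```python
import re

# Shortened sample (hypothetical): a link without a prefix, then one with a prefix.
fragment = ("[[Russian Academy of Sciences]], and "
            "[[National Academy of Sciences|US National Academy of Sciences]]")

# Single-pass pattern from the question: the optional prefix group has no ]] guard.
unguarded = re.compile(r"\[\[([^|]*\|)?(.*?)\]\]")
m = unguarded.search(fragment)
prefix, label = m.group(1), m.group(2)
print(repr(prefix))  # the "prefix" swallows the first link and the start of the second
print(repr(label))

# Tempered version: a character is consumed only if ]] does not start at that position.
guarded = re.compile(r"\[{2}(?:(?:(?!]{2})[^|])+\|)*((?:(?!]{2})[^|])+)]{2}")
first = guarded.search(fragment).group(1)
print(repr(first))
```

Because the whole of group 1 is discarded by the replacement, the unguarded pattern silently drops 'Russian Academy of Sciences, and ', which is the missing text reported in the question.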

Replacement: \1

Result:

including the Danish Academy of Sciences, Norwegian Academy of Sciences, Russian Academy of Sciences, and US National Academy of Sciences.
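For completeness, here is a minimal sketch of how the pattern above could be applied in Python with re.sub, using the string from the question. The name LINK is just a label chosen for this example.

```python
import re

s = ("including the [[Royal Danish Academy of Sciences and Letters|Danish Academy of Sciences]], "
     "[[Norwegian Academy of Science and Letters|Norwegian Academy of Sciences]], "
     "[[Russian Academy of Sciences]], and "
     "[[National Academy of Sciences|US National Academy of Sciences]].")

# Zero or more "prefix|" parts, then the captured text, all inside [[ ... ]].
LINK = re.compile(r"\[{2}(?:(?:(?!]{2})[^|])+\|)*((?:(?!]{2})[^|])+)]{2}")

# Keep only capture group 1 (the text after the last |, or the whole link text).
result = LINK.sub(r"\1", s)
print(result)
```

Since the prefix group is quantified with *, a link containing several | separators keeps only its last segment.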

Another option that may work for you is the following. It is less specific than the regex above, but it does not use any lookarounds.

\[{2}(?:[^]|]+\|)*([^]|]+)]{2}
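As a rough illustration of what "less specific" means here, the sketch below compares the two patterns in Python. The `]` inside the character class is escaped for readability, which should be equivalent to the pattern as written above; the `odd` string is a made-up edge case with a single ] inside the link text, and all variable names are hypothetical.

```python
import re

# Lookaround-free variant: link text may contain neither ] nor |.
SIMPLE = re.compile(r"\[{2}(?:[^\]|]+\|)*([^\]|]+)]{2}")
# Tempered variant from above: only the two-character sequence ]] ends the text.
TEMPERED = re.compile(r"\[{2}(?:(?:(?!]{2})[^|])+\|)*((?:(?!]{2})[^|])+)]{2}")

ordinary = ("[[Russian Academy of Sciences]] and "
            "[[National Academy of Sciences|US National Academy of Sciences]]")
odd = "[[Array]index]]"  # made-up input with a lone ] inside the brackets

simple_ordinary = SIMPLE.sub(r"\1", ordinary)
tempered_ordinary = TEMPERED.sub(r"\1", ordinary)
simple_odd = SIMPLE.sub(r"\1", odd)      # no match, so the string is returned unchanged
tempered_odd = TEMPERED.sub(r"\1", odd)  # matches and strips the outer brackets

print(simple_ordinary == tempered_ordinary)
print(simple_odd, tempered_odd)
```

For input like the question's, where link text never contains a lone ], the two patterns should behave the same, and the simpler one is easier to read.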

