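<p>You can optimize the pattern slightly by using a negated character class: in <code>\[[^[\]]+\]</code>, the class <code>[^[\]]</code> matches any character other than an opening or closing square bracket.</p>
<p>For the fifth group, you can repeat the same pattern so that it captures all remaining bracketed fields as a single group:</p>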
<pre><code>(>ref\|NC_\d+\|)( \[[^[\]]+\])( \[[^[\]]+\])( \[[^[\]]+\])( \[[^[\]]+\](?: \[[^[\]]+\])*)
</code></pre>
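<p>To see what each capture group holds, here is a minimal sketch that matches one of the sample headers and prints the groups. The variable names (<code>pattern</code>, <code>line</code>, <code>groups</code>) are only illustrative, and the pattern is split across two string literals purely for readability.</p>

```python
import re

# Same pattern as above, split across two literals for readability.
pattern = re.compile(
    r"(>ref\|NC_\d+\|)( \[[^[\]]+\])( \[[^[\]]+\])( \[[^[\]]+\])"
    r"( \[[^[\]]+\](?: \[[^[\]]+\])*)"
)

line = (">ref|NC_001224| [org=Saccharomyces cerevisiae] [strain=S288C] "
        "[moltype=genomic] [location=mitochondrion] [top=circular]")

match = pattern.match(line)
groups = match.groups()

# Groups 1-4 hold the header and the first three fields;
# group 5 holds every remaining bracketed field.
for number, text in enumerate(groups, start=1):
    print(number, repr(text))
```

<p>Each of groups 2 to 5 keeps its leading space, which is why the replacement does not need to add spaces between them.</p>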
<p><a href="https://regex101.com/r/B4lelG/1" rel="nofollow noreferrer">Regex demo</a> | <a href="https://ideone.com/PhpifC" rel="nofollow noreferrer">Python demo</a></p>
<p>In the replacement, use:</p>
<pre><code>>\5\2\3\4[\1]
</code></pre>
<p>For example, using <a href="https://docs.python.org/3/library/re.html#re.sub" rel="nofollow noreferrer">re.sub</a>:</p>
<pre><code>import re

# Group 1: header; groups 2-4: first three fields; group 5: all remaining fields
regex = r"(>ref\|NC_\d+\|)( \[[^[\]]+\])( \[[^[\]]+\])( \[[^[\]]+\])( \[[^[\]]+\](?: \[[^[\]]+\])*)"
test_str = (">ref|NC_001133| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=I]\n"
">ref|NC_001224| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [location=mitochondrion] [top=circular]")
# Move group 5 to the front and wrap the original header in square brackets
subst = r">\5\2\3\4[\1]"
result = re.sub(regex, subst, test_str)
print(result)
</code></pre>
<p>Output:</p>
<pre><code>> [chromosome=I] [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic][>ref|NC_001133|]
> [location=mitochondrion] [top=circular] [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic][>ref|NC_001224|]
</code></pre>
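<p>Note that the pattern requires at least four bracketed fields (groups 2, 3, 4, and at least one field in group 5). As a small sketch of that behavior, the header below is a made-up example with only three fields (it is not from the original data); <code>re.subn</code> reports zero substitutions and returns the line unchanged.</p>

```python
import re

regex = r"(>ref\|NC_\d+\|)( \[[^[\]]+\])( \[[^[\]]+\])( \[[^[\]]+\])( \[[^[\]]+\](?: \[[^[\]]+\])*)"
subst = r">\5\2\3\4[\1]"

# Hypothetical header with only three bracketed fields (illustrative, not real data).
short_header = ">ref|NC_000000| [org=Example] [strain=X] [moltype=genomic]"

# re.subn returns the new string together with the number of substitutions made.
result, count = re.subn(regex, subst, short_header)
print(count)
print(result == short_header)
```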