<p>Your problem may be that you are matching on <code>\r\n</code>. Try using just <code>\n</code> instead:</p>
<pre>
>>> import re
>>> x = """
... >U51677 Human non-histone chromatin protein HMG1 (HMG1) gene, complete
...
... cds. #some records don't have this line (see below)
...
... Length = 2575
... (some text)
...
... >U51677 Human non-histone chromatin protein HMG1 (HMG1) gene, complete
...
... Length = 2575
... (some text)
...
... (etc...)
... """
>>> re.search(r"^(>.*)\n.*(?:\n*.?)Length\s=\s(\d+)", x, re.MULTILINE|re.DOTALL)
<_sre.SRE_Match object at 0x10c937e00>
>>> _.group(2)
'2575'
</pre>
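<p>If the input really does arrive with Windows line endings, one option is to normalize them before matching, so that patterns written with <code>\n</code> keep working. The following is only a minimal sketch: <code>normalize_newlines</code> is a hypothetical helper name, and the record is an abridged copy of the sample above with <code>\r\n</code> line endings assumed.</p>

```python
import re


def normalize_newlines(text):
    # Hypothetical helper: turn Windows (CRLF) and bare CR line endings into LF.
    return text.replace("\r\n", "\n").replace("\r", "\n")


# Abridged sample record with Windows line endings (illustrative only).
raw = (
    ">U51677 Human non-histone chromatin protein HMG1 (HMG1) gene, complete\r\n"
    "\r\n"
    "Length = 2575\r\n"
    "(some text)\r\n"
)

pattern = re.compile(r"^(>.*?)$.*?Length\s=\s(\d+)", re.MULTILINE | re.DOTALL)

# Match against the normalized text so the header is captured without a trailing "\r".
clean = normalize_newlines(raw)
matches = pattern.findall(clean)
print(matches)
```

<p>Without the normalization step, this pattern still matches the record, but the captured header ends with a stray <code>\r</code>.</p>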
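<p>Also, your first <code>.*</code> is too greedy: with <code>re.DOTALL</code> it can run past the end of the header line. Try the non-greedy form <code>^(>.*?)$.*?Length\s=\s(\d+)</code> instead:</p>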
<pre>
>>> re.findall(r"^(>.*?)$.*?Length\s=\s(\d+)", x, re.MULTILINE|re.DOTALL)
[('>U51677 Human non-histone chromatin protein HMG1 (HMG1) gene, complete', '2575'), ('>U51677 Human non-histone chromatin protein HMG1 (HMG1) gene, complete', '2575')]
</pre>
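<p>To see why the greedy quantifier matters here, it may help to run both forms side by side. This is a rough sketch on made-up, abridged records (the headers and lengths are invented for illustration, not real sequence data):</p>

```python
import re

# Made-up, abridged records (illustrative only).
text = (
    ">rec1 first header\n"
    "\n"
    "Length = 100\n"
    "(some text)\n"
    "\n"
    ">rec2 second header\n"
    "\n"
    "Length = 200\n"
    "(some text)\n"
)

flags = re.MULTILINE | re.DOTALL

# Greedy: with DOTALL, ".*" runs across newlines, so a single match
# starts at the first header and extends to the last "Length" line.
greedy = re.findall(r"^(>.*)$.*Length\s=\s(\d+)", text, flags)

# Non-greedy: each ".*?" stops at the first point where the rest of the
# pattern can match, so every record yields its own (header, length) pair.
lazy = re.findall(r"^(>.*?)$.*?Length\s=\s(\d+)", text, flags)

print(len(greedy))  # 1: one match that swallows both records
print(lazy)
```

<p>The greedy version reports only the last length (<code>'200'</code>) and its first group contains both headers, while the non-greedy version returns one tuple per record.</p>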