<p>I want to parse the HTML example below.</p>
<p>The example is one part of a particular HTML page.</p>
<pre><code><p>NUCLEAR EK:</p>
<ul>
<li>2015-01-29 17:22:12 UTC - culturemerge.ga - GET /AgJVAhoAGFpMUAVU.html</li>
<li>2015-01-29 17:22:13 UTC - culturemerge.ga - GET /AU4STwAHU1NMUUlcSlMHVAFRVwJTB1RKVx1XA1ZMAVUFSgRWTwBfVg</li>
<li>2015-01-29 17:22:15 UTC - culturemerge.ga - GET /Al8OVhpVUFUBHgYYDh4CUgFWVwVQBFYGHgZIAlRQHlMCVBhQBxoGGDpaIEUi</li>
<li>2015-01-29 17:22:17 UTC - culturemerge.ga - GET /Al8OVhpVUFUBHgYYDh4CUgFWVwVQBFYGHgZIAlRQHlMCVBhQBxoGGBpgEF8mYRhdIk9W</li>
<li>2015-01-29 17:22:21 UTC - culturemerge.ga - GET /Al8OVhpVUFUBHgYYDh4CUgFWVwVQBFYGHgZIAlRQHlMCVBhQBxoEGDpaIEUi</li>
<li>2015-01-29 17:22:22 UTC - culturemerge.ga - GET /Al8OVhpVUFUBHgYYDh4CUgFWVwVQBFYGHgZIAlRQHlMCVBhQBxoEGBpgEF8mYRhdIk9W</li>
<li>2015-01-29 17:22:23 UTC - culturemerge.ga - GET /AU4STwAHU1NMUUlcSlMHVAFRVwJTB1RKVx1XA1ZMAVUFSgRWTxVaCBRVEA</li>
<li>2015-01-29 17:22:25 UTC - culturemerge.ga - GET /Al8OVhpVUFUBHgYYDh4CUgFWVwVQBFYGHgZIAlRQHlMCVBhQBxoLGDpaIEUi</li>
<li>2015-01-29 17:22:28 UTC - culturemerge.ga - GET /Al8OVhpVUFUBHgYYDh4CUgFWVwVQBFYGHgZIAlRQHlMCVBhQBxoLGBpgEF8mYRhdIk9W</li>
</ul>
</code></pre>
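<p>I want to extract everything from the opening <code><p></code> through the closing <code></ul></code>.</p>
<p>So I wrote the following PCRE-style regex in Python:</p>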
<pre><code>temp=re.findall(r"<p>[^\"\&\;]*?<\/p>\s*<ul>\s*<li>\d(.|\s)*?<\/ul>",html)
print temp
</code></pre>
<p>This regex works fine in Notepad++ and Regex Coach.</p>
<p>But in Python it does not match anything!</p>
<p>It only shows an empty list, <code>[]</code>.</p>
<pre><code> temp=re.finditer(r"<p>[^\"\&\;]*?<\/p>\s*<ul>\s*<li>\d(.|\s)*?<\/ul>",html)
for match in temp:
print match.group(0)
</code></pre>
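<p>For reference, here is a minimal sketch of how the two calls behave on the sample alone. It assumes a shortened copy of the snippet above stands in for <code>html</code> (the real page source may differ, which could be why nothing matches there), and the names <code>grouped</code>, <code>pattern</code> and <code>blocks</code> are only illustrative. It also shows that <code>re.findall</code> returns the contents of the capturing group rather than the whole match, so a variant without the group and with <code>re.DOTALL</code> is used to get the full block.</p>

```python
import re

# Assumption: a shortened copy of the sample stands in for the real page source.
html = """<p>NUCLEAR EK:</p>
<ul>
<li>2015-01-29 17:22:12 UTC - culturemerge.ga - GET /AgJVAhoAGFpMUAVU.html</li>
<li>2015-01-29 17:22:13 UTC - culturemerge.ga - GET /AU4STwAHU1NMUUlcSlMHVAFRVwJTB1RKVx1XA1ZMAVUFSgRWTwBfVg</li>
</ul>
"""

# The pattern from the question. Because it contains a capturing group,
# findall returns what the group captured last, not the whole match.
grouped = re.findall(r"<p>[^\"\&\;]*?<\/p>\s*<ul>\s*<li>\d(.|\s)*?<\/ul>", html)
print(grouped)

# Variant without the capturing group: re.DOTALL lets "." match newlines,
# so ".*?" can replace "(.|\s)*?" and group(0) is the full <p>...</ul> block.
pattern = re.compile(r'<p>[^"&;]*?</p>\s*<ul>\s*<li>\d.*?</ul>', re.DOTALL)
blocks = [m.group(0) for m in pattern.finditer(html)]
print(len(blocks))
print(blocks[0])
```

<p>If this matches the sample but the real page still yields <code>[]</code>, the text inside the real <code><p></code> may contain one of the excluded characters (<code>"</code>, <code>&</code>, <code>;</code>), or the tags may carry attributes. That is only a guess, since the full page is not shown.</p>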