<p><em>Do not use regexp:</em></p>
<p>This is why you should <a href="https://stackoverflow.com/questions/701166/can-you-provide-some-examples-of-why-it-is-hard-to-parse-xml-and-html-with-a-reg">not reach for regex</a> first when dealing with HTML or XML (or URLs).</p>
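<p>As a rough illustration of the parser route, here is a minimal sketch using the standard library's <code>html.parser</code>. It assumes the input is real HTML with anchor tags, which is not the parenthesized format used in the example further down; the names <code>HrefCollector</code> and <code>extract_hrefs</code> are made up for this sketch.</p>

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Hypothetical helper: collects the href value of every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def extract_hrefs(html_text):
    """Return all href values found in html_text, in document order."""
    collector = HrefCollector()
    collector.feed(html_text)
    collector.close()
    return collector.hrefs


# Assumed sample input, written as real HTML for the purpose of this sketch
sample = '<p>blah <a href="example.com">one</a> bla <a href="polymer.edu">two</a></p>'
print(extract_hrefs(sample))  # ['example.com', 'polymer.edu']
```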
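<p><em>If you still want to use regex,</em></p>
<p>there are several patterns that do the job, and several ways to fetch the strings you are looking for.</p>
<p>These patterns do the job (the greedy <code>(.*)</code> variant is only safe when no further <code>")</code> follows the link):</p>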
<p><code>r'\(a href="(.*?)"\)'</code></p>
<p><code>r'\(a href="(.*)"\)'</code></p>
<p><code>r'\(a href="(.+?)"\)'</code></p>
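<p>To see how the lazy and greedy quantifiers differ, here is a small sketch run against the sample string used later in this answer. The variable names <code>lazy</code> and <code>greedy</code> are only for illustration.</p>

```python
import re

st = 'blahblahblah (a href="example.com") another bla (a href="polymer.edu")'

# Lazy quantifier: stops at the first '")' after each opening
lazy = re.findall(r'\(a href="(.*?)"\)', st)

# Greedy quantifier: runs to the LAST '")' in the string
greedy = re.findall(r'\(a href="(.*)"\)', st)

print(lazy)    # ['example.com', 'polymer.edu']
print(greedy)  # ['example.com") another bla (a href="polymer.edu']
```

<p>With a single link in the string both variants return the same result, so the greedy form is only a problem when several links share one string.</p>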
<p><strong>1. re.findall()</strong></p>
<pre><code>re.findall(pattern, string, flags=0)
</code></pre>
<blockquote>
<p>Return all non-overlapping matches of pattern in string, as a list of
strings. The string is scanned left-to-right, and matches are returned
in the order found. If one or more groups are present in the pattern,
return a list of groups; this will be a list of tuples if the pattern
has more than one group. Empty matches are included in the result
unless they touch the beginning of another match.</p>
</blockquote>
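<p>The behaviour described in the quote can be checked with a short sketch. It assumes the sample string from the end of this answer and shows what <code>findall</code> returns with zero, one, and two groups; the variable names are illustrative only.</p>

```python
import re

st = 'blahblahblah (a href="example.com") another bla (a href="polymer.edu")'

# No group: each list item is the whole match
whole = re.findall(r'\(a href=".*?"\)', st)

# One group: each list item is the content of that group
single = re.findall(r'\(a href="(.*?)"\)', st)

# Two groups: each list item is a tuple (group 1, group 2)
pairs = re.findall(r'\(a href="(.+?(.).+?)"\)', st)

print(whole)   # ['(a href="example.com")', '(a href="polymer.edu")']
print(single)  # ['example.com', 'polymer.edu']
print(pairs)   # [('example.com', 'x'), ('polymer.edu', 'o')]
```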
<p><strong>2. re.search()</strong></p>
<p><code>re.search(pattern, string, flags=0)</code></p>
<blockquote>
<p>Scan through string looking for a location where the regular
expression pattern produces a match, and return a corresponding
MatchObject instance.</p>
</blockquote>
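<p>Then retrieve the groups with the match object's <code>group()</code> method. For example, with the regex <code>r'\(a href="(.+?(.).+?)"\)'</code>, which also works here, there are several nested groups: group 0 is the whole match, group 1 is the first parenthesized subpattern, <code>(.+?(.).+?)</code>, and group 2 is the inner <code>(.)</code>.</p>
<p>Use <code>search</code> when you only need the <strong>first</strong> occurrence of the pattern. With your example:</p>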
<pre><code>>>> import re
>>> st = 'blahblahblah (a href="example.com") another bla (a href="polymer.edu")'
>>> m = re.search(r'\(a href="(.+?(.).+?)"\)', st)
>>> m.group(1)
'example.com'
</code></pre>
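<p>Continuing with the same string and pattern, the sketch below prints the other groups and guards against the case where nothing matches, since <code>re.search</code> returns <code>None</code> then. The helper name <code>first_href</code> is hypothetical.</p>

```python
import re

# Same pattern as above: group 1 is the URL, group 2 is the inner (.)
PATTERN = re.compile(r'\(a href="(.+?(.).+?)"\)')


def first_href(text):
    """Hypothetical helper: return the first URL, or None when nothing matches."""
    m = PATTERN.search(text)
    if m is None:  # re.search returns None when there is no match
        return None
    return m.group(1)


st = 'blahblahblah (a href="example.com") another bla (a href="polymer.edu")'
m = PATTERN.search(st)

print(m.group(0))              # (a href="example.com")
print(m.group(2))              # x
print(m.groups())              # ('example.com', 'x')
print(first_href(st))          # example.com
print(first_href('no links'))  # None
```

<p>Checking for <code>None</code> before calling <code>group()</code> avoids an <code>AttributeError</code> on input that contains no link.</p>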