Answers to the question "Python regular expression Tokeniz"

Python regular expression Tokeniz

1 answer

Anonymous, 1 day ago

Don't reach for a regexp first: when dealing with HTML or XML (or URLs), you should <a href="https://stackoverflow.com/questions/701166/can-you-provide-some-examples-of-why-it-is-hard-to-parse-xml-and-html-with-a-reg">not think of regex</a> as the first tool.

If you still want to use a regex, there are several patterns that do the job, and several ways to get at the strings you are looking for.

These patterns work on a string that contains one link: <code>r'\(a href="(.*?)"\)'</code> <code>r'\(a href="(.*)"\)'</code> <code>r'\(a href="(.+?)"\)'</code>. When the string contains several links, only the lazy forms (<code>.*?</code> and <code>.+?</code>) return each link separately; the greedy <code>(.*)</code> runs on to the last <code>")</code> in the string.

1. findall()

<pre><code>re.findall(pattern, string, flags=0)
</code></pre>

<blockquote> Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match. </blockquote>

2. search()

<code>re.search(pattern, string, flags=0)</code>

<blockquote> Scan through string looking for a location where the regular expression pattern produces a match, and return a corresponding MatchObject instance. </blockquote>

Then retrieve the groups with the match object's <code>group()</code> method. For example, with the regex <code>r'\(a href="(.+?(.).+?)"\)'</code>, which also works here, you have several nested groups: group 0 is the match of the whole pattern, group 1 is the match of the first parenthesized subpattern, <code>(.+?(.).+?)</code>, and group 2 is the match of the inner <code>(.)</code>.

Use search when you only want the first occurrence of the pattern. With your example:

<pre><code>>>> import re
>>> st = 'blahblahblah (a href="example.com") another bla (a href="polymer.edu")'
>>> m = re.search(r'\(a href="(.+?(.).+?)"\)', st)
>>> m.group(1)
'example.com'
</code></pre>
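To compare the two approaches side by side, here is a minimal sketch that runs findall() and search() on the sample string from the answer. It assumes the patterns are meant to match the literal parentheses around each link with `\(` and `\)`; the variable names `lazy`, `greedy`, and `groups` are only illustrative.

```python
import re

# Sample string from the answer; each link is wrapped in literal parentheses.
st = 'blahblahblah (a href="example.com") another bla (a href="polymer.edu")'

# Lazy quantifier: every match stops at the first closing quote it can.
lazy = re.findall(r'\(a href="(.*?)"\)', st)

# Greedy quantifier: a single match that runs to the last '")' in the string.
greedy = re.findall(r'\(a href="(.*)"\)', st)

# search() returns a match object for the first occurrence only, or None.
m = re.search(r'\(a href="(.+?(.).+?)"\)', st)
groups = (m.group(0), m.group(1), m.group(2)) if m else None

print(lazy)    # one list entry per link
print(greedy)  # one entry spanning both links
print(groups)  # whole match, outer group, inner group
```

With one capturing group in the pattern, findall() returns a flat list of strings. The greedy variant shows why the lazy form is usually the safer choice when a line may hold more than one link.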
