<p>Here is a very simple solution that uses a non-greedy regular expression to remove all HTML tags:</p>
<pre><code>import re
s = "<div class = \"fix\"> part of text <div something> other text </div> some more text </div>"
# Non-greedy match: each '<...>' stops at the first '>', so tags are removed one at a time.
s_text = re.sub(r'<.*?>', '', s)
</code></pre>
<p>The resulting values are:</p>
<pre><code>print(s)
<div class = "fix"> part of text <div something> other text </div> some more text </div>
print(s_text)
 part of text  other text  some more text 
</code></pre>
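<p>The regular expression above only looks for <code>&lt;</code> and <code>&gt;</code>, so it leaves extra whitespace where tags were removed, and it would cut a tag short if a quoted attribute value contained <code>&gt;</code>. As a possible alternative, here is a minimal sketch that uses the standard library's <code>html.parser</code> instead. The names <code>TextExtractor</code> and <code>strip_tags</code> are hypothetical helpers introduced for illustration, not part of the original answer:</p>
<pre><code>import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects only the text between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for text content; the tags themselves are skipped.
        self.chunks.append(data)


def strip_tags(html):
    """Hypothetical helper: return the text of an HTML fragment."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    # Collapse the runs of whitespace left behind by removed tags.
    return re.sub(r'\s+', ' ', ''.join(parser.chunks)).strip()


s = "<div class = \"fix\"> part of text <div something> other text </div> some more text </div>"
print(strip_tags(s))
</code></pre>
<p>For quick one-off cleanup the regular expression is usually enough. A parser-based approach is worth considering when the input is not under your control.</p>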