<p>I ran into this situation:</p>
<pre class="lang-html prettyprint-override"><code><div>
<script>
some code
</script>
text here
</div>
</code></pre>
<p><code>div.remove(script)</code> also deletes the <code>text here</code> part, which I did not intend to remove.</p>
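<p>A minimal sketch of why this happens, assuming the fragment is collapsed onto one line for brevity: lxml keeps the text that follows an element in that element's <code>tail</code>, so the text leaves the tree together with the element.</p>

```python
from lxml import etree

# A one-line version of the fragment above (whitespace dropped for brevity).
div = etree.fromstring("<div><script>some code</script>text here</div>")
script = div.find("script")

# The text after </script> belongs to the script element as its tail.
print(repr(script.tail))      # 'text here'

# Removing the element therefore removes the tail text as well.
div.remove(script)
print(etree.tostring(div))    # b'<div/>'
```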
<p>Following the answer <a href="https://stackoverflow.com/questions/5418201/how-can-one-replace-an-element-with-text-in-lxml#answer-5420500">here</a>, I found that <code>etree.strip_elements</code> is a better solution for me: its <code>with_tail=(bool)</code> parameter controls whether the text following the element is removed as well.</p>
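<p>For illustration, here is a small sketch of <code>strip_elements</code> applied to the same kind of fragment (again collapsed onto one line, which is an assumption made for readability). With <code>with_tail=False</code> the element is dropped and the trailing text stays in the parent.</p>

```python
from lxml import etree

div = etree.fromstring("<div><script>some code</script>text here</div>")

# Drop every <script> descendant of div, but keep the text that follows it.
etree.strip_elements(div, "script", with_tail=False)

print(etree.tostring(div))    # b'<div>text here</div>'
```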
<p>However, I still don't know whether it can use an XPath filter to select the tags. I'm just posting this for information.</p>
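<p>Going by the documented signature, <code>strip_elements</code> takes tag names rather than XPath expressions. One possible workaround, sketched here under that assumption, is to select the elements with <code>xpath()</code> and move each tail by hand before removing the element. The helper name <code>remove_keep_tail</code> and the <code>type</code> attributes are made up for this example.</p>

```python
from lxml import etree


def remove_keep_tail(element):
    """Remove element from its parent but keep the text that follows it.

    Hypothetical helper, not part of lxml.
    """
    parent = element.getparent()
    if parent is None:
        return  # the root element has no parent to detach from
    if element.tail:
        previous = element.getprevious()
        if previous is not None:
            # Append the tail to the previous sibling's tail.
            previous.tail = (previous.tail or "") + element.tail
        else:
            # No previous sibling: the tail becomes part of the parent's text.
            parent.text = (parent.text or "") + element.tail
    parent.remove(element)


root = etree.fromstring(
    '<div><script type="text/javascript">some code</script>text here'
    '<script type="text/template">keep me</script></div>'
)

# xpath() returns a list, so removing elements while looping over it is safe.
for script in root.xpath('//script[@type="text/javascript"]'):
    remove_keep_tail(script)

print(etree.tostring(root))
```

<p>Only the script matched by the XPath predicate is removed; the other one and the text between them remain.</p>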
<p>Here is the documentation:</p>
<blockquote>
<p>strip_elements(tree_or_element, *tag_names, with_tail=True)</p>
<p>Delete all elements with the provided tag names from a tree or
subtree. This will remove the elements and their entire subtree,
including all their attributes, text content and descendants. It
will also remove the tail text of the element unless you
explicitly set the <code>with_tail</code> keyword argument option to False.</p>
<p>Tag names can contain wildcards as in <code>_Element.iter</code>.</p>
<p>Note that this will not delete the element (or ElementTree root
element) that you passed even if it matches. It will only treat
its descendants. If you want to include the root element, check
its tag name directly before even calling this function.</p>
<p>Example usage::</p>
<pre><code>strip_elements(some_element,
    'simpletagname',             # non-namespaced tag
    '{http://some/ns}tagname',   # namespaced tag
    '{http://some/other/ns}*',   # any tag from a namespace
    lxml.etree.Comment           # comments
    )
</code></pre>
</blockquote>
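<p>Two points from the quoted documentation can be checked with a short sketch: the element passed in is never removed even if its tag matches, and <code>etree.Comment</code> can be given alongside tag names. The document below is invented for the example.</p>

```python
from lxml import etree

# The root is itself a <script>, and it contains a comment and a nested <script>.
root = etree.fromstring(
    "<script><!-- note --><script>inner</script>after<p>kept</p></script>"
)

# Strip nested <script> elements and comments, keeping any trailing text.
etree.strip_elements(root, "script", etree.Comment, with_tail=False)

print(root.tag)                   # the root element is still there
print([child.tag for child in root])
print(etree.tostring(root))
```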