<p>As a public service for anyone who may be as lazy as I am, here is some of the code from the answers above in a form you can run.</p>
<pre><code>from lxml import etree


def get_text1(node):
    # Direct text only: node.text plus the tail of each child.
    result = node.text or ""
    for child in node:
        if child.tail is not None:
            result += child.tail
    return result


def get_text2(node):
    # All text, recursively: own text, every descendant, then own tail.
    return ((node.text or '') +
            ''.join(map(get_text2, node)) +
            (node.tail or ''))


def get_text3(node):
    # Inner markup: node.text plus each child serialized with its tail.
    # encoding="unicode" makes tostring() return str instead of bytes.
    return (node.text or "") + "".join(
        [etree.tostring(child, encoding="unicode")
         for child in node.iterchildren()])


root = etree.fromstring("<td> text1 <a> link </a> text2 </td>")

print(root.xpath("text()"))
print(get_text1(root))
print(get_text2(root))
print(root.xpath("string()"))
print(etree.tostring(root, method="text", encoding="unicode"))
print(etree.tostring(root, method="xml", encoding="unicode"))
print(get_text3(root))
</code></pre>
<p>The output is:</p>
<pre><code>snowy:rpg$ python test.py
[' text1 ', ' text2 ']
 text1  text2 
 text1  link  text2 
 text1  link  text2 
 text1  link  text2 
<td> text1 <a> link </a> text2 </td>
 text1 <a> link </a> text2 
</code></pre>
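<p>For what it's worth, the recursive <code>get_text2</code> approach can probably be replaced by the element's own <code>itertext()</code> iterator. The sketch below is not from the answers above: the helper names <code>all_text</code> and <code>collapsed_text</code> are made up for illustration, and the fallback import to the standard library's <code>xml.etree.ElementTree</code> is only an assumption so the snippet still runs where lxml is not installed (both share the same <code>.text</code>/<code>.tail</code> model).</p>

```python
# Sketch: collect all text under a node with itertext().
try:
    from lxml import etree  # the library used in the code above
except ImportError:
    # Assumption: fall back to the stdlib, which has the same text/tail model.
    import xml.etree.ElementTree as etree


def all_text(node):
    """Hypothetical helper: node.text plus the text and tails of all descendants."""
    return "".join(node.itertext())


def collapsed_text(node):
    """Hypothetical helper: same text with whitespace runs collapsed to one space."""
    return " ".join(all_text(node).split())


root = etree.fromstring("<td> text1 <a> link </a> text2 </td>")
print(repr(all_text(root)))        # ' text1  link  text2 '
print(repr(collapsed_text(root)))  # 'text1 link text2'
```

<p>The first line should match what <code>get_text2(root)</code> and <code>root.xpath("string()")</code> return for this input; the second is handy when the surrounding whitespace does not matter.</p>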