How do I select all descendants of a specific element with ElementTree in Python 3.3?
Here is some sample data.
The input file is input.xml:
<root>
<entry id="1">
<headword>go</headword>
<example>I <hw>go</hw> to school.</example>
</entry>
</root>
I want to move a node, together with all of its descendants, into a new parent element. That is,
the output file should be output.xml:
<root>
<entry id="1">
<headword>go</headword>
<examplegrp>
<example>I <hw>go</hw> to school.</example>
</examplegrp>
</entry>
</root>
This is the incomplete script I have so far:
import codecs
import xml.etree.ElementTree as ET
fin = codecs.open(r'input.xml', 'rb', encoding='utf-8')
data = ET.parse(fin)
root = data.getroot()
example = root.find('.//example')
for elem in example.iter():
    # ---and then I don't know what to do---
    pass
2 Answers
0
http://docs.python.org/3/library/xml.dom.html?highlight=xml#node-objects http://docs.python.org/3/library/xml.dom.html?highlight=xml#document-objects
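You probably want to create a new element on the document and then append each result to it.
group = document.createElement(tagName)  # document: a parsed xml.dom Document object
for found in founds:
    group.appendChild(found)  # xml.dom nodes are attached with appendChild
Or something along those lines.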
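The idea above can be sketched with the standard library's xml.dom.minidom. This is only a minimal sketch, not a definitive implementation: the helper name group_examples and the XML constant are made up for illustration, and it assumes the <example> elements are direct children of each <entry>, as in the sample data.

```python
import xml.dom.minidom

# Sample data from the question, without whitespace between elements
# (assumption: this keeps the serialized result easy to compare).
XML = ('<root><entry id="1"><headword>go</headword>'
       '<example>I <hw>go</hw> to school.</example></entry></root>')


def group_examples(xml_text):
    """Wrap the <example> children of every <entry> in an <examplegrp>."""
    doc = xml.dom.minidom.parseString(xml_text)
    for entry in doc.getElementsByTagName("entry"):
        # Collect only direct <example> children of this entry.
        examples = [node for node in entry.childNodes
                    if node.nodeType == node.ELEMENT_NODE
                    and node.tagName == "example"]
        if not examples:
            continue
        group = doc.createElement("examplegrp")
        # Put the group where the first example currently sits.
        entry.insertBefore(group, examples[0])
        for found in examples:
            # appendChild detaches the node from its old parent first,
            # so the whole subtree (including <hw>) moves with it.
            group.appendChild(found)
    return doc.documentElement.toxml()


result = group_examples(XML)
print(result)
```

Because appendChild moves a node rather than copying it, there is no need to walk the descendants one by one; they travel with their parent.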
0
Here is an example that shows how to do it:
text = """
<root>
<entry id="1">
<headword>go</headword>
<example>I <hw>go</hw> to school.</example>
</entry>
</root>
"""
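import io
import lxml.etree
# Parse the sample document from the string above
data = lxml.etree.parse(io.StringIO(text))
root = data.getroot()
# Visit every <entry> that contains an <example>
for entry in root.xpath('//example/ancestor::entry[1]'):
    # Add a new <examplegrp> as the last child of the entry
    examplegrp = lxml.etree.SubElement(entry, "examplegrp")
    nodes = entry.xpath('./example')
    for node in nodes:
        # Move the <example>, with all of its descendants, into the group
        entry.remove(node)
        examplegrp.append(node)
# tostring() returns bytes on Python 3, so decode before printing
print(lxml.etree.tostring(root, pretty_print=True).decode())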
This code prints:
<root>
<entry id="1">
<headword>go</headword>
<examplegrp><example>I <hw>go</hw> to school.</example>
</examplegrp></entry>
</root>
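Since the question asks about ElementTree rather than lxml, the same approach can be sketched with the standard library alone. This is a rough sketch under a few assumptions: the helper name wrap_examples and the XML constant are hypothetical, and the <example> elements are assumed to be direct children of <entry>. The standard ElementTree has no parent pointers and no ancestor axis, so the loop starts from each <entry> instead of from each <example>.

```python
import xml.etree.ElementTree as ET

# Sample data from the question, without whitespace between elements
# (assumption: this keeps the serialized result easy to compare).
XML = ('<root><entry id="1"><headword>go</headword>'
       '<example>I <hw>go</hw> to school.</example></entry></root>')


def wrap_examples(root, tag="example", wrapper_tag="examplegrp"):
    """Move the `tag` children of every <entry> into a new wrapper element."""
    # list() takes a snapshot so the tree can be modified safely in the loop.
    for entry in list(root.iter("entry")):
        examples = entry.findall(tag)  # direct children only
        if not examples:
            continue
        # Insert the wrapper at the position of the first example.
        index = list(entry).index(examples[0])
        wrapper = ET.Element(wrapper_tag)
        entry.insert(index, wrapper)
        for example in examples:
            # Moving the element moves its whole subtree, so there is
            # no need to iterate over the descendants individually.
            entry.remove(example)
            wrapper.append(example)
    return root


root = ET.fromstring(XML)
wrap_examples(root)
output = ET.tostring(root, encoding="unicode")
print(output)
```

To write the result to a file instead of printing it, something like ET.ElementTree(root).write("output.xml", encoding="utf-8") should work.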