在Python中使用ElementTree发出命名空间规范
我正在尝试使用element-tree生成一个包含XML声明和命名空间的XML文件。以下是我的示例代码:
from xml.etree import ElementTree as ET
ET.register_namespace('com',"http://www.company.com") #some name
# build a tree structure
root = ET.Element("STUFF")
body = ET.SubElement(root, "MORE_STUFF")
body.text = "STUFF EVERYWHERE!"
# wrap it in an ElementTree instance, and save as XML
tree = ET.ElementTree(root)
tree.write("page.xml",
xml_declaration=True,
method="xml" )
但是,生成的文件里既没有出现<?xml
标签,也没有任何命名空间或前缀信息。我对此感到有些困惑。
2 个回答
9
我从来没办法通过代码把 <?xml
标签从元素树库中提取出来,所以我建议你可以试试下面这种方法。
from xml.etree import ElementTree as ET
root = ET.Element("STUFF")
root.set('com','http://www.company.com')
body = ET.SubElement(root, "MORE_STUFF")
body.text = "STUFF EVERYWHERE!"
f = open('page.xml', 'w')
f.write('<?xml version="1.0" encoding="UTF-8"?>' + ET.tostring(root))
f.close()
不同的非标准库的 Python ElementTree 实现可能在指定命名空间时有不同的方式,所以如果你决定换用 lxml,声明这些命名空间的方法也会有所不同。
52
虽然文档上说的可能不一样,但我发现只有在同时指定了xml_declaration和编码后,才能得到一个<?xml>
声明。
你需要在你注册的命名空间中声明节点,才能在文件中的节点上看到这个命名空间。下面是你代码的修正版本:
from xml.etree import ElementTree as ET
ET.register_namespace('com',"http://www.company.com") #some name
# build a tree structure
root = ET.Element("{http://www.company.com}STUFF")
body = ET.SubElement(root, "{http://www.company.com}MORE_STUFF")
body.text = "STUFF EVERYWHERE!"
# wrap it in an ElementTree instance, and save as XML
tree = ET.ElementTree(root)
tree.write("page.xml",
xml_declaration=True,encoding='utf-8',
method="xml")
输出 (page.xml)
<?xml version='1.0' encoding='utf-8'?><com:STUFF xmlns:com="http://www.company.com"><com:MORE_STUFF>STUFF EVERYWHERE!</com:MORE_STUFF></com:STUFF>
ElementTree也不会自动美化输出。这里是经过美化的输出:
<?xml version='1.0' encoding='utf-8'?>
<com:STUFF xmlns:com="http://www.company.com">
<com:MORE_STUFF>STUFF EVERYWHERE!</com:MORE_STUFF>
</com:STUFF>
你还可以声明一个默认的命名空间,这样就不需要再注册一个了:
from xml.etree import ElementTree as ET
# build a tree structure
root = ET.Element("{http://www.company.com}STUFF")
body = ET.SubElement(root, "{http://www.company.com}MORE_STUFF")
body.text = "STUFF EVERYWHERE!"
# wrap it in an ElementTree instance, and save as XML
tree = ET.ElementTree(root)
tree.write("page.xml",
xml_declaration=True,encoding='utf-8',
method="xml",default_namespace='http://www.company.com')
输出 (美化后的格式是我自己调整的)
<?xml version='1.0' encoding='utf-8'?>
<STUFF xmlns="http://www.company.com">
<MORE_STUFF>STUFF EVERYWHERE!</MORE_STUFF>
</STUFF>