用Python3.2从头开始创建Unicode XML

from xml.etree import ElementTree as et def dict_to_elem(dictionary): item = et.Element('Item') for key in dictionary: field = et.Element(key.replace(' ','')) field.text = dictionary[key] item.append(field) return item newtree = et.ElementTree() root = et.Element('AllItems') newtree._setroot(root) root.append(dict_to_elem( {'some_tag':'Hello World', ...} ) # Lather, rinse, repeat this append step as needed with open( filename , 'w', encoding='utf-8') as file: tree.write(file, encoding='unicode')

2条回答

网友

1楼 · 编辑于 2024-04-19 13:57:58

不必这样做，但是如果工具不理解生成的xml文件，可以显式地添加xml声明：

#!/usr/bin/env python3
from xml.etree import ElementTree as etree

your_dict = {'some_tag': 'Hello World ☺'}

def add_items(root, items):
    for name, text in items:
        elem = etree.SubElement(root, name)
        elem.text = text

root = etree.Element('AllItems')
add_items(etree.SubElement(root, 'Item'),
          ((key.replace(' ', ''), value) for key, value in your_dict.items()))
tree = etree.ElementTree(root)
tree.write('output.xml', xml_declaration=True, encoding='utf-8')

output.xml:

<?xml version='1.0' encoding='utf-8'?>
<AllItems><Item><some_tag>Hello World ☺</some_tag></Item></AllItems>

网友

2楼 · 编辑于 2024-04-19 13:57:58

在我看来，ElementTree是一个不错的选择。如果将来需要更强大的包，可以切换到使用相同接口的第三方lxml模块。

您的问题的答案可以在文档http://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.ElementTree.write中找到

The output is either a string (str) or binary (bytes). This is controlled by the encoding argument. If encoding is "unicode", the output is a string; otherwise, it’s binary. Note that this may conflict with the type of file if it’s an open file object; make sure you do not try to write a string to a binary stream and vice versa.

基本上，你做得对。以文本模式open()文件，这样文件接受字符串，并且需要使用'unicode'参数作为tree.write()。否则，可以以二进制模式打开文件（在open()中没有编码参数），并在tree.write()中使用'utf-8'。

稍微清理一下独立工作的代码：

#!python3
from xml.etree import ElementTree as et

def dict_to_elem(dictionary):
    item = et.Element('Item')
    for key in dictionary:
        field = et.Element(key.replace(' ',''))
        field.text = dictionary[key]
        item.append(field)
    return item

root = et.Element('AllItems')     # create the element first...
tree = et.ElementTree(root)       # and pass it to the created tree

root.append(dict_to_elem(  {'some_tag':'Hello World', 'xxx': 'yyy'}  ))
# Lather, rinse, repeat this append step as needed

filename = 'a.xml'
with open(filename, 'w', encoding='utf-8') as file:
    tree.write(file, encoding='unicode')

# The alternative is...    
fname = 'b.xml'
with open(fname, 'wb') as f:
    tree.write(f, encoding='utf-8')

这取决于目的。在这两个问题中，我个人更喜欢第一个解决方案。它清楚地表明您编写了一个文本文件（而XML是一个文本文件）。

但不需要告诉编码的最简单的替代方法是将文件名传递给tree.write，如下所示：

tree.write('c.xml', encoding='utf-8')

它打开文件，使用给定的编码写入内容，然后关闭文件。你读起来很容易，在这里也不会出错。

output.xml:

相关问题更多 >

编程相关推荐

热门问题

热门文章