<p>I updated the input from <code><\key></code> to <code></key></code>, and removed one <code>dict</code> tag because no key was defined for it.</p>
<ol>
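<li>Parse the XML data with the <code>lxml.html</code> module.</li>
<li>Get the target main <code>dict</code> tag with the <code>xpath()</code> method.</li>
<li>Call the <code>XMLtoDict()</code> function.</li>
<li>Iterate over the children of the input tag with the <code>getchildren()</code> method and a <code>for</code> loop.</li>
<li>Use an <code>if</code> statement to check whether the tag name is <code>key</code>.</li>
<li>If it is, get the tag that follows the current tag with the <code>getnext()</code> method.</li>
<li>If the next tag is an <code>integer</code> tag, convert the value to <code>int</code>.</li>
<li>If the next tag is a <code>string</code> tag, the value type is <code>string</code>.</li>
<li>If the next tag is a <code>dict</code> tag, the value type is <code>dict</code>, so call the function again, i.e. recurse.</li>
<li>Add the key and value to the result dictionary.</li>
<li>Return the result dictionary.</li>
<li>Print the result dictionary.</li>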
</ol>
<p>Code:</p>
<pre><code>import pprint

import lxml.html as ET

# Bytes literal so the encoding declaration is honoured on Python 2 and 3
# (lxml may reject a unicode string that carries one).
data = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>Version</key>
    <integer>1</integer>
    <key>Tracks</key>
    <dict>
        <key>0001</key>
        <dict>
            <key>Name</key><string>spam</string>
            <key>Detail</key><string>spam spam</string>
        </dict>
        <key>0002</key>
        <dict>
            <key>Name</key><string>ham</string>
            <key>Detail</key><string>ham ham</string>
        </dict>
    </dict>
</dict>
</plist>
"""


def XMLtoDict(root):
    """Recursively convert a plist <dict> element into a Python dict."""
    result = {}
    for i in root.getchildren():  # deprecated in lxml; list(root) is equivalent
        if i.tag == "key":
            key = i.text
            # The value is the sibling element right after the <key> tag.
            next_tag = i.getnext()
            next_tag_name = next_tag.tag
            if next_tag_name == "integer":
                value = int(next_tag.text)
            elif next_tag_name == "string":
                value = next_tag.text
            elif next_tag_name == "dict":
                value = XMLtoDict(next_tag)  # nested dict: recurse
            else:
                value = None  # unsupported value type
            result[key] = value
    return result


root = ET.fromstring(data)
result = XMLtoDict(root.xpath("//plist/dict")[0])
pprint.pprint(result)
</code></pre>
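<p>Output:</p>
<pre><code>{'Tracks': {'0001': {'Detail': 'spam spam', 'Name': 'spam'},
            '0002': {'Detail': 'ham ham', 'Name': 'ham'}},
 'Version': 1}
</code></pre>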
<hr/>
<ol>
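<li><p>I am not getting this exception:</p>
<p>(unicode error) 'unicodeescape' codec can't decode bytes…</p></li>
<li><p>The tags in <code>library.xml</code> are not correct.</p>
<p><code>import xml.etree.ElementTree as ET</code><br/>
<code>tree = ET.parse('library.xml')</code></p></li>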
</ol>
<p>With that input I get the following exception:</p>
<pre><code>vivek@vivek:~/Desktop/stackoverflow$ python 12.py
Traceback (most recent call last):
  File "12.py", line 46, in <module>
    tree = ET.parse('library.xml')
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1183, in parse
    tree.parse(source, parser)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 656, in parse
    parser.feed(data)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1643, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1507, in _raiseerror
    raise err
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 4, column 15
</code></pre>
<p>This exception is caused by an invalid closing tag. To fix it:</p>
<p>Change <code><key>Version<\key></code> to <code><key>Version</key></code></p>
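<p>As a rough illustration of that fix, the sketch below reproduces the parse error on a small in-memory sample and then repairs the closing tag with a plain string replacement. The sample string and the helper name <code>parse_or_report</code> are made up for this example, and the blanket <code>replace()</code> assumes <code><\key></code> never appears legitimately in the data.</p>
<pre><code>import xml.etree.ElementTree as ET

# Hypothetical sample that reproduces the malformed closing tag.
broken = "<plist><dict><key>Version<\\key><integer>1</integer></dict></plist>"


def parse_or_report(text):
    """Return the root element, or the (line, column) of the parse error."""
    try:
        return ET.fromstring(text)
    except ET.ParseError as err:
        return err.position


# The malformed input yields an error position instead of an element.
print(parse_or_report(broken))

# Assumption: every "<\key>" is a typo for "</key>".
fixed = broken.replace("<\\key>", "</key>")
root = parse_or_report(fixed)
print(root.find("dict/key").text)
</code></pre>
<p>The position reported by <code>ParseError</code> corresponds to the line and column shown in the traceback, which helps locate the bad tag in a larger file.</p>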
<ol start="3">
<li>Using the <code>xml.etree.ElementTree</code> module:</li>
</ol>
<p>Code:</p>
<pre><code>import pprint
import xml.etree.ElementTree as ET


def XMLtoDict(root):
    """Recursively convert a plist <dict> element into a Python dict."""
    result = {}
    children_tags = list(root)  # getchildren() was removed in Python 3.9
    for j, i in enumerate(children_tags):
        if i.tag == "key":
            key = i.text
            # ElementTree has no getnext(); take the next child by index.
            next_tag = children_tags[j + 1]
            next_tag_name = next_tag.tag
            if next_tag_name == "integer":
                value = int(next_tag.text)
            elif next_tag_name == "string":
                value = next_tag.text
            elif next_tag_name == "dict":
                value = XMLtoDict(next_tag)  # nested dict: recurse
            else:
                value = None  # unsupported value type
            result[key] = value
    return result


def XMLtoList(root):
    """Same traversal, but keep document order as [key, value] pairs."""
    result = []
    children_tags = list(root)
    for j, i in enumerate(children_tags):
        if i.tag == "key":
            key = i.text
            next_tag = children_tags[j + 1]
            next_tag_name = next_tag.tag
            if next_tag_name == "integer":
                value = int(next_tag.text)
            elif next_tag_name == "string":
                value = next_tag.text
            elif next_tag_name == "dict":
                value = XMLtoList(next_tag)
            else:
                value = None
            result.append([key, value])
    return result


tree = ET.parse('library.xml')
root = tree.getroot()
dict_tag = root.find("dict")
if dict_tag is not None:
    result = XMLtoDict(dict_tag)
    print("Result in Dictinary:-")
    pprint.pprint(result)
    # Label text kept as in the recorded run; this one is the list form.
    result = XMLtoList(dict_tag)
    print("\nResult in Dictinary:-")
    pprint.pprint(result)
</code></pre>
<p>Output:
<code>vivek@vivek:~/Desktop/stackoverflow$ python 12.py</code></p>
<pre><code>Result in Dictinary:-
{'Tracks': {'0001': {'Detail': 'spam spam', 'Name': 'spam'},
            '0002': {'Detail': 'ham ham', 'Name': 'ham'}},
 'Version': 1}

Result in Dictinary:-
[['Version', 1],
 ['Tracks',
  [['0001', [['Name', 'spam'], ['Detail', 'spam spam']]],
   ['0002', [['Name', 'ham'], ['Detail', 'ham ham']]]]]]
</code></pre>
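<p>The key/value pairing can also be written without index arithmetic. The following is only a sketch, not the code that produced the output above: it assumes the children of every <code>dict</code> tag strictly alternate between a <code>key</code> tag and a value tag, and the function name <code>plist_dict_to_dict</code> and the inline sample are invented for illustration.</p>
<pre><code>import xml.etree.ElementTree as ET


def plist_dict_to_dict(dict_tag):
    """Convert a <dict> element whose children alternate key, value."""
    children = list(dict_tag)
    result = {}
    # Pair children 0-1, 2-3, ...; assumes strict key/value alternation.
    for key_tag, value_tag in zip(children[0::2], children[1::2]):
        if value_tag.tag == "integer":
            value = int(value_tag.text)
        elif value_tag.tag == "string":
            value = value_tag.text
        elif value_tag.tag == "dict":
            value = plist_dict_to_dict(value_tag)  # nested dict: recurse
        else:
            value = None  # unsupported value type
        result[key_tag.text] = value
    return result


# Hypothetical, shortened version of the input used above.
sample = (
    "<plist version='1.0'><dict>"
    "<key>Version</key><integer>1</integer>"
    "<key>Tracks</key><dict>"
    "<key>0001</key><dict><key>Name</key><string>spam</string></dict>"
    "</dict></dict></plist>"
)
result = plist_dict_to_dict(ET.fromstring(sample).find("dict"))
print(result)
</code></pre>
<p>Because <code>zip()</code> stops at the shorter sequence, a trailing <code>key</code> with no value is silently skipped here, whereas the index-based version would raise an <code>IndexError</code>.</p>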