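<p>If you want to merge File2 into File1, you can loop over all the elements in File2 and copy the attributes from each File2 element onto the matching element in File1.</p>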
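<p>A minimal sketch of that loop, assuming elements are matched by tag name plus <code>id</code> attribute and that each such pair is unique within a document. It uses the standard library's <code>xml.etree.ElementTree</code> rather than lxml, and the helper name <code>copy_attributes</code> and the two tiny documents are made up for illustration:</p>

```python
import xml.etree.ElementTree as ET


def copy_attributes(source_root, target_root):
    """Copy attributes from source elements onto target elements
    that have the same tag and the same id (hypothetical helper)."""
    # Index the target elements by (tag, id) for direct lookup.
    targets = {}
    for elem in target_root.iter():
        if elem.get('id') is not None:
            targets[(elem.tag, elem.get('id'))] = elem

    for elem in source_root.iter():
        match = targets.get((elem.tag, elem.get('id')))
        if match is None:
            # No counterpart in the target document; nothing to copy.
            continue
        for name, value in elem.attrib.items():
            if name != 'id':
                match.set(name, value)


file1 = ET.fromstring('<book><sentence id="1"/><sentence id="2"/></book>')
file2 = ET.fromstring('<book><sentence id="2" guitar="Alex"/></book>')
copy_attributes(file2, file1)
print(ET.tostring(file1, encoding='unicode'))
```

<p>This only copies attributes; elements that exist in just one file are left alone here.</p>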
<p>I had to do something similar in a project I am working on. Here is my current solution, which should work under Python 2.7.</p>
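<p>Note that I went one step further and added the requirement of copying attributes between common nodes. You will see that I added the following attributes to <code>xmlA</code>:</p>
<ul>
<li>drums='Neil'</li>
<li>bass='Geddy'</li>
</ul>
<p>Then I added one more to <code>xmlB</code>:</p>
<ul>
<li>guitar='Alex'</li>
</ul>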
<p>The final merged file contains all three members of the power trio.</p>
<p>I also added <code><sentence id='3'/></code> to show that the order of the elements no longer matters.</p>
<pre><code>#!/usr/bin/python
from copy import deepcopy

from lxml import etree
xmlA='''
<book>
<chapter id="113">
<sentence id="1" drums='Neil'>
<word id="128160" bass='Geddy'>
<POS Tag="V"/>
<grammar type="STEM"/>
<Aspect type="IMPV"/>
<Number type="S"/>
</word>
<word id="128161">
<POS Tag="V"/>
<grammar type="STEM"/>
<Aspect type="IMPF"/>
</word>
</sentence>
<sentence id="2">
<word id="128162">
<POS Tag="P"/>
<grammar type="PREFIX"/>
<Tag Tag="bi+"/>
</word>
</sentence>
</chapter>
</book>
'''
xmlB='''
<book>
<chapter id="113">
<sentence id="3">
<word id="128168">
<concept English="sadness"/>
</word>
</sentence>
<sentence id="1">
<word id="128160">
<concept English="joke"/>
</word>
<word id="128161">
<concept English="romance"/>
</word>
</sentence>
<sentence id="2" guitar='Alex'>
<word id="128162">
<concept English="happiness"/>
</word>
</sentence>
</chapter>
</book>
'''
import re


def convertXpath(element):
    """Translate a positional XPath into one keyed on the id attribute.

    For <sentence id='1'/> in the examples above, getpath() returns:
      - xmlA: /book/chapter/sentence[1]
      - xmlB: /book/chapter/sentence[2]

    A path that identifies the same element in both documents is:
      - xmlA: /book/chapter/sentence[@id='1']
      - xmlB: /book/chapter/sentence[@id='1']
    """
    newPath = ''
    tree = element.getroottree()
    root = tree.getroot()
    for p in tree.getpath(element).split('/'):
        if p == '':
            continue
        if re.search(r'\[[0-9]+\]', p):
            # Look up the element at this step so we can read its id.
            node = root.xpath(newPath + '/' + p)[0]
            nodeId = node.get('id')
            if nodeId is not None:
                p = re.sub(r'\[[0-9]+\]', '', p) + "[@id='" + nodeId + "']"
            # If there is no id, the positional step is kept unchanged.
        newPath += '/' + p
    return newPath


def mergeXml(a, b):
    """Merge document a into document b, matching elements by id."""
    for node in a.nodes():
        path = convertXpath(node)
        # Find the matching element(s) in the other document.
        elements = b.root.xpath(path)
        for e in elements:
            # Copy every attribute except the id used for matching.
            for name, value in node.items():
                if name == 'id':
                    continue
                e.set(name, value)
        if len(elements) == 0:
            # No match: copy the node under the equivalent parent in b.
            newElement = deepcopy(node)
            parent = node.getparent()
            bParent = b.root.xpath(convertXpath(parent))[0]
            bParent.append(newElement)


class XmlDoc:
    def __init__(self, xml):
        self.root = etree.fromstring(xml)
        self.tree = self.root.getroottree()

    def __str__(self):
        # encoding='unicode' makes tostring() return text rather than bytes.
        return etree.tostring(self.root, pretty_print=True,
                              encoding='unicode')

    def nodes(self):
        return self.root.iter('*')


if __name__ == '__main__':
    a = XmlDoc(xmlA)
    b = XmlDoc(xmlB)
    mergeXml(a, b)
    # print(b) with a single argument also works on Python 2.7.
    print(b)
</code></pre>
<p>This produces the following output:</p>
<pre><code><book>
<chapter id="113">
<sentence id="3">
<word id="128168">
<concept English="sadness"/>
</word>
</sentence>
<sentence id="1" drums="Neil">
<word id="128160" bass="Geddy">
<concept English="joke"/>
<POS Tag="V"/>
<grammar type="STEM"/>
<Aspect type="IMPV"/>
<Number type="S"/>
</word>
<word id="128161">
<concept English="romance"/>
<POS Tag="V"/>
<grammar type="STEM"/>
<Aspect type="IMPF"/>
</word>
</sentence>
<sentence id="2" guitar="Alex">
<word id="128162">
<concept English="happiness"/>
<POS Tag="P"/>
<grammar type="PREFIX"/>
<Tag Tag="bi+"/>
</word>
</sentence>
</chapter>
</book>
</code></pre>
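<p>To illustrate why the paths are rewritten to use <code>id</code> instead of a position, here is a small sketch. It is not part of the solution above: it uses the standard library's <code>xml.etree.ElementTree</code>, whose <code>find()</code> accepts both positional and attribute predicates, and the two trimmed-down documents are made up from the examples:</p>

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-ins for xmlA and xmlB (illustrative only).
doc_a = ET.fromstring(
    '<book><chapter id="113">'
    '<sentence id="1"/><sentence id="2"/>'
    '</chapter></book>')
doc_b = ET.fromstring(
    '<book><chapter id="113">'
    '<sentence id="3"/><sentence id="1"/><sentence id="2"/>'
    '</chapter></book>')

# Positional lookup: the first <sentence> is a different element in each document.
first_a = doc_a.find('chapter/sentence[1]')
first_b = doc_b.find('chapter/sentence[1]')
print(first_a.get('id'), first_b.get('id'))

# Lookup by id: the same logical element is found wherever it sits.
by_id_a = doc_a.find("chapter/sentence[@id='1']")
by_id_b = doc_b.find("chapter/sentence[@id='1']")
print(by_id_a.get('id'), by_id_b.get('id'))
```

<p>The positional lookup lands on sentence 1 in one document and sentence 3 in the other, while the id lookup finds sentence 1 in both, which is why the element order stops mattering once the path is keyed on <code>id</code>.</p>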