Python lxml: writing elements to a file in a predefined order
I want to write the following lxml etree child elements to my ODM XML file:
<ElementProtocolat0x3803048>,
<ElementStudyEventDefat0x3803108>,
<ElementFormDefat0x3803248>,
<ElementItemGroupDefat0x38032c8>,
<ElementClinicalDataat0x3803408>,
<ElementItemGroupDataat0x38035c8>,
<ElementFormDefat0x38036c8>,
and they need to be arranged in a predefined order, that is:
<ElementProtocolat0x3803048>,
<ElementStudyEventDefat0x3803108>,
<ElementFormDefat0x3803248>,
<ElementFormDefat0x38036c8>,
<ElementItemGroupDefat0x38032c8>,
<ElementItemGroupDataat0x38035c8>,
<ElementClinicalDataat0x3803408>,
....
Is there a way to sort these elements according to a predefined list?
predefined_order = ['Protocol', 'StudyEventDef', 'FormDef', 'ItemGroupDef', 'ItemDef', 'CodeList']
2 Answers
0
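Sorry, I don't know much about XML, but I used basic Python to put your data into sorted form. Note that this works on the text of the entries, not on lxml element objects.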
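import re

data = """<ElementProtocolat0x3803048>,
<ElementStudyEventDefat0x3803108>,
<ElementFormDefat0x3803248>,
<ElementItemGroupDefat0x38032c8>,
<ElementClinicalDataat0x3803408>,
<ElementItemGroupDataat0x38035c8>,
<ElementFormDefat0x38036c8>,"""

predefined_order = ['Protocol', 'StudyEventDef', 'FormDef', 'ItemGroupDef', 'ItemGroupData', 'CodeList', 'ClinicalData']

# Split into individual entries, dropping whitespace and the empty trailing item
entries = [e.strip() for e in data.split(',') if e.strip()]

# The with block closes the file even if an error occurs
with open("something.xml", "w") as fh1:
    # For each name in the desired order, write every entry that contains it
    for name in predefined_order:
        for entry in entries:
            if re.search(name, entry):
                fh1.write(entry + ',\n')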
Output:
<ElementProtocolat0x3803048>,
<ElementStudyEventDefat0x3803108>,
<ElementFormDefat0x3803248>,
<ElementFormDefat0x38036c8>,
<ElementItemGroupDefat0x38032c8>,
<ElementItemGroupDataat0x38035c8>,
<ElementClinicalDataat0x3803408>,
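The nested loops above can also be expressed as a single stable sort. The following is only a sketch under the same assumption as the answer (the entries are plain strings, not element objects); `rank`, `name_pattern`, and `sort_key` are hypothetical names introduced for illustration.

```python
import re

entries = [
    "<ElementProtocolat0x3803048>",
    "<ElementStudyEventDefat0x3803108>",
    "<ElementFormDefat0x3803248>",
    "<ElementItemGroupDefat0x38032c8>",
    "<ElementClinicalDataat0x3803408>",
    "<ElementItemGroupDataat0x38035c8>",
    "<ElementFormDefat0x38036c8>",
]
predefined_order = ['Protocol', 'StudyEventDef', 'FormDef', 'ItemGroupDef',
                    'ItemGroupData', 'CodeList', 'ClinicalData']

# Map each name to its position in the desired order
rank = {name: pos for pos, name in enumerate(predefined_order)}

# Capture the name between "Element" and the trailing "at0x<hex>"
name_pattern = re.compile(r'<Element(.+)at0x[0-9a-f]+>')

def sort_key(entry):
    # Entries that do not match, or whose name is unlisted, sort last
    match = name_pattern.fullmatch(entry)
    if match is None:
        return len(rank)
    return rank.get(match.group(1), len(rank))

# sorted() is stable, so the two FormDef entries keep their relative order
ordered = sorted(entries, key=sort_key)
```

Unlike the substring search, this compares the whole extracted name, so a name that happens to be contained in another one would not be matched twice.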
5
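This example shows:
- how to read an XML file,
- that an element behaves like a list of its children and can be manipulated like one,
- how to sort a list according to a predefined order of matchable substrings,
- how to write an XML file.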
from lxml import etree
import re

# Parse the XML and find the root
tree = etree.parse('input.xml')
root = tree.getroot()

# Find the element whose children should be sorted
some_arbitrary_expression_to_find_the_list = '.'
element_list = root.xpath(some_arbitrary_expression_to_find_the_list)[0]

predefined_order = [
    'Protocol',
    'StudyEventDef',
    'FormDef',
    'ItemGroupDef',
    'ItemGroupData',
    'ItemDef',
    'CodeList',
    'ClinicalData']

# Extract the name between "Element" and "at0x" from each tag
# (named tag_pattern to avoid shadowing the built-in filter)
tag_pattern = re.compile(r'Element(.*)at0x.*')

# Replace the children in place with a sorted copy; sorted() is stable.
# index() raises ValueError for a tag that is missing from predefined_order.
element_list[:] = sorted(
    element_list,
    key=lambda x: predefined_order.index(tag_pattern.match(x.tag).group(1)))

# Write the XML to the output file.
# etree.tostring() returns bytes, so the file is opened in binary mode.
with open('output.xml', 'wb') as output_file:
    output_file.write(etree.tostring(tree, pretty_print=True))
Sample input:
<stuff>
<ElementProtocolat0x3803048 />
<ElementStudyEventDefat0x3803108 />
<ElementFormDefat0x3803248 />
<ElementItemGroupDefat0x38032c8>Random Text</ElementItemGroupDefat0x38032c8>
<ElementClinicalDataat0x3803408 />
<ElementItemGroupDataat0x38035c8><tag1><tag2 attr="random tags"/></tag1></ElementItemGroupDataat0x38035c8>
<ElementFormDefat0x38036c8 />
</stuff>
Output:
<stuff>
<ElementProtocolat0x3803048/>
<ElementStudyEventDefat0x3803108/>
<ElementFormDefat0x3803248/>
<ElementFormDefat0x38036c8/>
<ElementItemGroupDefat0x38032c8>Random Text</ElementItemGroupDefat0x38032c8>
<ElementItemGroupDataat0x38035c8><tag1><tag2 attr="random tags"/></tag1></ElementItemGroupDataat0x38035c8>
<ElementClinicalDataat0x3803408/>
</stuff>
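If the children carry the plain names from the question (`Protocol`, `FormDef`, and so on) rather than the `Element...at0x...` strings, the regular expression is not needed. The sketch below is an assumption-laden illustration, not the answer's method: it uses the standard library's `xml.etree.ElementTree` so it runs without extra packages (the calls used here also exist in `lxml.etree`), the `<root>` wrapper, the `OID` attributes, and the `Unlisted` tag are made up, and tags missing from the list are placed last instead of raising an error.

```python
import xml.etree.ElementTree as ET

predefined_order = ['Protocol', 'StudyEventDef', 'FormDef',
                    'ItemGroupDef', 'ItemDef', 'CodeList']

# Map each tag to its position in the desired order
rank = {tag: pos for pos, tag in enumerate(predefined_order)}

# Hypothetical, simplified input; a real ODM file has namespaces and attributes
root = ET.fromstring(
    '<root><Protocol/><StudyEventDef/><FormDef OID="F1"/><ItemGroupDef/>'
    '<CodeList/><FormDef OID="F2"/><Unlisted/><ItemDef/></root>'
)

def order_key(element):
    # Tags missing from predefined_order sort after all listed tags
    return rank.get(element.tag, len(rank))

# Slice assignment replaces the children in place; sorted() is stable,
# so elements sharing a tag keep their original relative order
root[:] = sorted(root, key=order_key)

sorted_tags = [child.tag for child in root]
form_oids = [child.get('OID') for child in root if child.tag == 'FormDef']
```

In a namespaced document each tag has the form `{namespace}Name`, so the prefix would have to be stripped before the lookup.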