Accessing XML files in a folder when the XML has an unusual structure

Posted 2024-04-23 12:08:49

<?xml version="1.0" encoding="UTF-8" standalone="yes"?><document DateTime="2017-06-23T04:27:08.592Z"><PeakInfo No="1" mz="505.2315648572003965" Intensity="4531.0000000000000000" Rel_Intensity="3.2737729673489735" Resolution="1879.5638812957554364" SNR="14.0278637770897561" Area="1348.1007591467391649" Rel_Area="2.3371194184605959" Index="238.9999999999976694"/><PeakInfo No="2" mz="522.1330917856538463" Intensity="3382.0000000000000000" Rel_Intensity="2.4435886505350317" Resolution="3502.9921209527169594" SNR="10.4705882352940982" Area="881.4468100654634100" Rel_Area="1.5281101521284057" Index="925.0000000000000000"/></document>

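Above is part of the XML file I need to parse. I watched some YouTube videos on how to parse and extract data from XML files, but for some reason what they cover does not seem to apply to my file. If I am not mistaken, these PeakInfo entries are elements. However, I cannot seem to access the mz and Intensity values for each PeakInfo number.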

import xml.etree.ElementTree as ET
import os

file_name = 'E7.xml'
full_file = os.path.abspath(os.path.join('xmllist', file_name))

pl = ET.parse(full_file)

peakinfos = pl.findall('PeakInfo')

for p in peakinfos:
    mz = p.find('mz')
    print(mz)

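Above is the code I wrote based on some YouTube videos. Here I try to access the mz value from each PeakInfo element, without success. What can I do to get what I want?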

Edit: print(pl) outputs an xml.etree.ElementTree.ElementTree object.


1 answer
User
#1 · Posted 2024-04-23 12:08:49
s = '''<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
       <document DateTime="2017-06-23T04:27:08.592Z">
           <PeakInfo No="1" mz="505.2315648572003965"
                     Intensity="4531.0000000000000000"
                     Rel_Intensity="3.2737729673489735"
                     Resolution="1879.5638812957554364"
                     SNR="14.0278637770897561"
                     Area="1348.1007591467391649"
                     Rel_Area="2.3371194184605959"
                     Index="238.9999999999976694"/>
           <PeakInfo No="2" mz="522.1330917856538463"
                     Intensity="3382.0000000000000000"
                     Rel_Intensity="2.4435886505350317"
                     Resolution="3502.9921209527169594"
                     SNR="10.4705882352940982"
                     Area="881.4468100654634100"
                     Rel_Area="1.5281101521284057"
                     Index="925.0000000000000000"/>
       </document>'''

import xml.etree.ElementTree as ET

root = ET.fromstring(s)
peakinfos = root.findall('PeakInfo')

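findall looks for elements, but you are trying to access element attributes. mz and Intensity are attributes of each PeakInfo element, not child elements, so p.find('mz') returns None.
Use attrib or get to access the values.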

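for p in peakinfos:
    # get() and attrib[...] both return the attribute value as a string
    print('mz is ...', p.get('mz'))
    print('mz is ...', p.attrib['mz'])
    # items() yields every (attribute name, value) pair of the element
    for k, v in p.items():
        print('{}: {}'.format(k, v))
    print('             ')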
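
To tie this back to the question, here is a minimal sketch that collects the No, mz and Intensity attributes as numbers and shows how the same logic could run over every XML file in a folder. The helper names extract_peaks and peaks_from_folder are hypothetical, the sample is shortened to three attributes, and it assumes every PeakInfo element carries all three.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Shortened sample: only the attributes used below are kept.
SAMPLE = '''<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<document DateTime="2017-06-23T04:27:08.592Z">
<PeakInfo No="1" mz="505.2315648572003965" Intensity="4531.0000000000000000"/>
<PeakInfo No="2" mz="522.1330917856538463" Intensity="3382.0000000000000000"/>
</document>'''


def extract_peaks(root):
    """Return a list of (No, mz, Intensity) tuples from a <document> root.

    Hypothetical helper: assumes each PeakInfo has all three attributes.
    """
    peaks = []
    for p in root.findall('PeakInfo'):
        # Attribute values are always strings, so convert them explicitly.
        peaks.append((int(p.get('No')),
                      float(p.get('mz')),
                      float(p.get('Intensity'))))
    return peaks


def peaks_from_folder(folder):
    """Map each *.xml file name in `folder` to its list of peaks.

    Hypothetical helper: assumes every file has the same structure.
    """
    return {path.name: extract_peaks(ET.parse(path).getroot())
            for path in sorted(Path(folder).glob('*.xml'))}


peaks = extract_peaks(ET.fromstring(SAMPLE))
for no, mz, intensity in peaks:
    print(no, mz, intensity)
```

For the files from the question, peaks_from_folder('xmllist') would return one entry per file, for example under the key 'E7.xml'. Note that float() cannot hold all the decimal digits stored in the file; if full precision matters, keep the strings or use decimal.Decimal.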
