Parsing an XML document with Python

Posted 2024-04-24 15:27:08

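I have an XML document that is fairly complex, at least for me, and I need to extract some information from it. I tried using the lxml library for this, but I got stuck.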

The XML document I have is very similar to the one below:

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="MeasDataCollection.xsl"?>
    <measCollecFile
        xmlns="http://www.3gpp.org/ftp/specs/archive/32_series/32.435#measCollec">
        <fileHeader fileFormatVersion="32.435 V8.0.0"
            vendorName="Nokia">
            <fileSender
                localDn="MCC=096,MNC=724,ManagedElement=SAEGW01LEM"
                elementType="pgw instance 1" />
            <measCollec beginTime="2019-05-14T12:00:01-03:00" />
        </fileHeader>
        <measData>
            <managedElement
                localDn="MCC=096,MNC=724,ManagedElement=SAEGW01LEM"
                swVersion="C-10.0.R9" />
            <measInfo measInfoId="KPISystemCP-ISA">
                <granPeriod duration="PT300S" endTime="2019-05-14T12:05:01-03:00" />
                <measType p="1">VS.avgCpuUtilization</measType>
                <measType p="2">VS.avgMemoryUtilization</measType>
                <measType p="3">VS.avgMemoryUtilization1M</measType>
                <measType p="4">VS.SDFsFpUtilization</measType>
                <measType p="5">VS.SDFsLcpUtilization</measType>
                <measType p="6">VS.avgVmFpCpuNicUsage</measType>
                <measType p="7">VS.avgVmFpCpuWorkerUsage</measType>
                <measType p="8">VS.avgVmFpCpuSchedulerUsage</measType>
                <measType p="9">VS.avgVmFpCpuCollapsedUsage</measType>
                <measType p="10">VS.avgVmFpCpuCombinedUsage</measType>
                <measType p="11">VS.hwCfgBitsInfo</measType>
                <measValue measObjLdn="KPI=System,GroupName=CP-ISA,group=1,slot=3,mda=1">
                    <r p="1">1</r>
                    <r p="2">72</r>
                    <r p="3">72</r>
                    <r p="4">0.00</r>
                    <r p="5">0.00</r>
                    <r p="6">0.00</r>
                    <r p="7">0.05</r>
                    <r p="8">0.00</r>
                    <r p="9">0.00</r>
                    <r p="10">0.00</r>
                    <r p="11">4</r>
                    <suspect>false</suspect>
                </measValue>
            </measInfo>

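I would like to know how to read the average utilization values using Python.

I can see that the value corresponding to VS.avgMemoryUtilization is 72, but how do I access it from Python using the lxml library?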


2 answers

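You can use BeautifulSoup to parse the XML data (the advantages are that you can use CSS selectors, that it copes with XML that may not be well formed, and so on):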

from bs4 import BeautifulSoup

data = '''    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="MeasDataCollection.xsl"?>
    <measCollecFile
        xmlns="http://www.3gpp.org/ftp/specs/archive/32_series/32.435#measCollec">
        <fileHeader fileFormatVersion="32.435 V8.0.0"
            vendorName="Nokia">
            <fileSender
                localDn="MCC=096,MNC=724,ManagedElement=SAEGW01LEM"
                elementType="pgw instance 1" />
            <measCollec beginTime="2019-05-14T12:00:01-03:00" />
        </fileHeader>
        <measData>
            <managedElement
                localDn="MCC=096,MNC=724,ManagedElement=SAEGW01LEM"
                swVersion="C-10.0.R9" />
            <measInfo measInfoId="KPISystemCP-ISA">
                <granPeriod duration="PT300S" endTime="2019-05-14T12:05:01-03:00" />
                <measType p="1">VS.avgCpuUtilization</measType>
                <measType p="2">VS.avgMemoryUtilization</measType>
                <measType p="3">VS.avgMemoryUtilization1M</measType>
                <measType p="4">VS.SDFsFpUtilization</measType>
                <measType p="5">VS.SDFsLcpUtilization</measType>
                <measType p="6">VS.avgVmFpCpuNicUsage</measType>
                <measType p="7">VS.avgVmFpCpuWorkerUsage</measType>
                <measType p="8">VS.avgVmFpCpuSchedulerUsage</measType>
                <measType p="9">VS.avgVmFpCpuCollapsedUsage</measType>
                <measType p="10">VS.avgVmFpCpuCombinedUsage</measType>
                <measType p="11">VS.hwCfgBitsInfo</measType>
                <measValue measObjLdn="KPI=System,GroupName=CP-ISA,group=1,slot=3,mda=1">
                    <r p="1">1</r>
                    <r p="2">72</r>
                    <r p="3">72</r>
                    <r p="4">0.00</r>
                    <r p="5">0.00</r>
                    <r p="6">0.00</r>
                    <r p="7">0.05</r>
                    <r p="8">0.00</r>
                    <r p="9">0.00</r>
                    <r p="10">0.00</r>
                    <r p="11">4</r>
                    <suspect>false</suspect>
                </measValue>
            </measInfo>'''

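soup = BeautifulSoup(data, 'xml')  # the 'xml' parser requires lxml to be installed
# Find the measType whose text contains the counter name (substring match) and read its index p.
p = soup.select_one('measType[p]:contains("VS.avgMemoryUtilization1M")')['p']
# The r element with the same p holds the measured value.
value = soup.select_one('r[p="{}"]'.format(p)).text
print('Value of `VS.avgMemoryUtilization1M`={}'.format(value))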

Prints:

Value of `VS.avgMemoryUtilization1M`=72
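
One caveat with the selector above: `:contains` matches substrings, so searching for `VS.avgMemoryUtilization` (the counter named in the question) would also match `VS.avgMemoryUtilization1M`, and `select_one` simply returns the first hit. Below is a minimal sketch that compares names exactly and collects every counter into a dict. It uses the standard library's `xml.etree.ElementTree` instead of BeautifulSoup so that it runs without extra packages. `SAMPLE` is a shortened stand-in for the document in the question and `counters_by_name` is a hypothetical helper name; the `{*}` namespace wildcard assumes Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical stand-in for the document in the question.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<measCollecFile xmlns="http://www.3gpp.org/ftp/specs/archive/32_series/32.435#measCollec">
  <measData>
    <measInfo measInfoId="KPISystemCP-ISA">
      <measType p="1">VS.avgCpuUtilization</measType>
      <measType p="2">VS.avgMemoryUtilization</measType>
      <measType p="3">VS.avgMemoryUtilization1M</measType>
      <measValue measObjLdn="KPI=System,GroupName=CP-ISA,group=1,slot=3,mda=1">
        <r p="1">1</r>
        <r p="2">72</r>
        <r p="3">72</r>
        <suspect>false</suspect>
      </measValue>
    </measInfo>
  </measData>
</measCollecFile>"""


def counters_by_name(xml_text):
    """Map each counter name to its value text, joined on the shared p index."""
    root = ET.fromstring(xml_text)
    result = {}
    # "{*}tag" matches the tag in any namespace (Python 3.8+).
    for info in root.findall(".//{*}measInfo"):
        # p index -> counter name, for this measInfo block only.
        names = {t.get("p"): t.text for t in info.findall("{*}measType")}
        # If a block has several measValue elements, later ones overwrite earlier ones.
        for r in info.findall("{*}measValue/{*}r"):
            name = names.get(r.get("p"))
            if name is not None:
                result[name] = r.text
    return result


counters = counters_by_name(SAMPLE)
print(counters["VS.avgMemoryUtilization"])  # prints 72
```

The values are kept as strings, exactly as they appear in the file; convert them with `float()` if you need numbers.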

Using Python's xml.etree.ElementTree:

import xml.etree.ElementTree as ET
import re

data = '''<?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="MeasDataCollection.xsl"?>
    <measCollecFile
        xmlns="http://www.3gpp.org/ftp/specs/archive/32_series/32.435#measCollec">
        <fileHeader fileFormatVersion="32.435 V8.0.0"
            vendorName="Nokia">
            <fileSender
                localDn="MCC=096,MNC=724,ManagedElement=SAEGW01LEM"
                elementType="pgw instance 1" />
            <measCollec beginTime="2019-05-14T12:00:01-03:00" />
        </fileHeader>
        <measData>
            <managedElement
                localDn="MCC=096,MNC=724,ManagedElement=SAEGW01LEM"
                swVersion="C-10.0.R9" />
            <measInfo measInfoId="KPISystemCP-ISA">
                <granPeriod duration="PT300S" endTime="2019-05-14T12:05:01-03:00" />
                <measType p="1">VS.avgCpuUtilization</measType>
                <measType p="2">VS.avgMemoryUtilization</measType>
                <measType p="3">VS.avgMemoryUtilization1M</measType>
                <measType p="4">VS.SDFsFpUtilization</measType>
                <measType p="5">VS.SDFsLcpUtilization</measType>
                <measType p="6">VS.avgVmFpCpuNicUsage</measType>
                <measType p="7">VS.avgVmFpCpuWorkerUsage</measType>
                <measType p="8">VS.avgVmFpCpuSchedulerUsage</measType>
                <measType p="9">VS.avgVmFpCpuCollapsedUsage</measType>
                <measType p="10">VS.avgVmFpCpuCombinedUsage</measType>
                <measType p="11">VS.hwCfgBitsInfo</measType>
                <measValue measObjLdn="KPI=System,GroupName=CP-ISA,group=1,slot=3,mda=1">
                    <r p="1">1</r>
                    <r p="2">72</r>
                    <r p="3">72</r>
                    <r p="4">0.00</r>
                    <r p="5">0.00</r>
                    <r p="6">0.00</r>
                    <r p="7">0.05</r>
                    <r p="8">0.00</r>
                    <r p="9">0.00</r>
                    <r p="10">0.00</r>
                    <r p="11">4</r>
                    <suspect>false</suspect>
                </measValue>
            </measInfo>
        </measData>
    </measCollecFile>
    '''

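# Drop the default namespace declaration so plain tag names can be used below.
data = re.sub(' xmlns="[^"]+"', '', data, count=1)
root = ET.fromstring(data)
# Take the third measType child (VS.avgMemoryUtilization1M) and read its p index.
p_val = root.find('.//measType[3]').attrib['p']
# Print the text of the r element carrying the same p index.
print(root.find(".//r[@p='{}']".format(p_val)).text)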

Output:

72
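
The snippet above removes the namespace with a regular expression and picks the counter by position. As an alternative, here is a minimal sketch that leaves the document untouched, passes a prefix-to-URI mapping to `find`/`findall`, and looks the counter up by name. The prefix `mc`, the shortened `SAMPLE` document, and the helper `counter_value` are all hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Prefix "mc" is arbitrary; only the URI has to match the document's xmlns.
NS = {"mc": "http://www.3gpp.org/ftp/specs/archive/32_series/32.435#measCollec"}

# Shortened, hypothetical stand-in for the document in the question (as bytes).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<measCollecFile xmlns="http://www.3gpp.org/ftp/specs/archive/32_series/32.435#measCollec">
  <measData>
    <measInfo measInfoId="KPISystemCP-ISA">
      <measType p="2">VS.avgMemoryUtilization</measType>
      <measType p="3">VS.avgMemoryUtilization1M</measType>
      <measType p="7">VS.avgVmFpCpuWorkerUsage</measType>
      <measValue measObjLdn="KPI=System,GroupName=CP-ISA,group=1,slot=3,mda=1">
        <r p="2">72</r>
        <r p="3">72</r>
        <r p="7">0.05</r>
        <suspect>false</suspect>
      </measValue>
    </measInfo>
  </measData>
</measCollecFile>"""


def counter_value(xml_bytes, counter_name):
    """Return the text of the r element for counter_name, or None if absent."""
    root = ET.fromstring(xml_bytes)
    for info in root.findall(".//mc:measInfo", NS):
        for meas_type in info.findall("mc:measType", NS):
            # Exact comparison, so "...Utilization" does not match "...Utilization1M".
            if (meas_type.text or "").strip() == counter_name:
                path = "mc:measValue/mc:r[@p='{}']".format(meas_type.get("p"))
                r = info.find(path, NS)
                return r.text if r is not None else None
    return None


print(counter_value(SAMPLE, "VS.avgMemoryUtilization"))  # prints 72
```

Since the question asked about lxml: as far as I know, `lxml.etree` accepts the same `namespaces` mapping in `find` and `findall`, so this function should carry over with only the import changed. Note that lxml's `fromstring` rejects a `str` that carries an encoding declaration, which is one reason the sample here is passed as bytes.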
