Python lxml: how to run XPath again on the result of an XPath query

Published 2024-04-25 22:30:54


I have the following XML file:

<?xml version="1.0" encoding="utf-8"?>
<!-- Created with Unified Streaming Platform(version=1.7.8) -->
<MPD
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns="urn:mpeg:dash:schema:mpd:2011"
        xsi:schemaLocation="urn:mpeg:dash:schema:mpd:2011 http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-DASH_schema_files/DASH-MPD.xsd"
        type="static"
        mediaPresentationDuration="PT1H43M36.832S"
        maxSegmentDuration="PT3S"
        minBufferTime="PT10S"
        profiles="urn:mpeg:dash:profile:isoff-live:2011,urn:com:dashif:dash264">
    <Period>
        <BaseURL>dash/</BaseURL>
        <AdaptationSet group="1" contentType="audio" lang="tr" minBandwidth="157405" maxBandwidth="157405"
                       segmentAlignment="true" audioSamplingRate="48000" mimeType="audio/mp4" codecs="mp4a.40.2">
            <AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="2">
            </AudioChannelConfiguration>
            <Representation id="audio_tur=157405" bandwidth="157405">
            </Representation>
        </AdaptationSet>
        <AdaptationSet group="2" contentType="video" lang="en" par="16:9" minBandwidth="501000" maxBandwidth="9001000"
                       minWidth="512" maxWidth="1920" minHeight="288" maxHeight="1080" segmentAlignment="true"
                       frameRate="25" mimeType="video/mp4" startWithSAP="1">
            <Representation id="video_eng=501000" bandwidth="501000" width="512" height="288" codecs="avc1.4D401E"
                            scanType="progressive">
            </Representation>
            <Representation id="video_eng=851000" bandwidth="851000" width="640" height="360" codecs="avc1.4D401E"
                            scanType="progressive">
            </Representation>
            <Representation id="video_eng=1302000" bandwidth="1302000" width="640" height="480" sar="4:3"
                            codecs="avc1.4D401F"
                            scanType="progressive">
            </Representation>
            <Representation id="video_eng=2601000" bandwidth="2601000" width="1024" height="576" codecs="avc1.4D401F"
                            scanType="progressive">
            </Representation>
            <Representation id="video_eng=2701000" bandwidth="2701000" width="1280" height="720" codecs="avc1.4D401F"
                            scanType="progressive">
            </Representation>
            <Representation id="video_eng=3501000" bandwidth="3501000" width="1280" height="720" codecs="avc1.4D401F"
                            scanType="progressive">
            </Representation>
            <Representation id="video_eng=6001000" bandwidth="6001000" width="1440" height="1080" sar="4:3"
                            codecs="avc1.4D4028" scanType="progressive">
            </Representation>
            <Representation id="video_eng=9001000" bandwidth="9001000" width="1920" height="1080" codecs="avc1.4D4028"
                            scanType="progressive">
            </Representation>
        </AdaptationSet>
        <AdaptationSet
                group="2" contentType="video" lang="en" par="20:11" minBandwidth="1901000" maxBandwidth="1901000"
                minWidth="872" maxWidth="872" segmentAlignment="true" width="720" height="480" sar="40:33"
                frameRate="25" mimeType="video/mp4" codecs="avc1.4D401F" startWithSAP="1">
            <Representation id="video_eng=1901000" bandwidth="1901000" scanType="progressive">
            </Representation>
        </AdaptationSet>
    </Period>
</MPD>

and I run the following Python code on it:

from lxml import etree

file = "Data.xml"
namespaces = {'ns':'urn:mpeg:dash:schema:mpd:2011'}

tree = etree.parse(file)
root = tree.getroot()

for r in root.xpath('//ns:AdaptationSet[@contentType="video"]', namespaces=namespaces):
    print(etree.tostring(r).decode())
    for bandwidth in r.xpath('//ns:Representation/@bandwidth', namespaces=namespaces):
        print(bandwidth)

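My problem is that the second loop does not work on the result of the previous XPath query but on the complete tree! That is why the results also include the audio Representation. The output looks like this: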

<AdaptationSet xmlns="urn:mpeg:dash:schema:mpd:2011" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" group="2" contentType="video" lang="en" par="16:9" minBandwidth="501000" maxBandwidth="9001000" minWidth="512" maxWidth="1920" minHeight="288" maxHeight="1080" segmentAlignment="true" frameRate="25" mimeType="video/mp4" startWithSAP="1">
            <Representation id="video_eng=501000" bandwidth="501000" width="512" height="288" codecs="avc1.4D401E" scanType="progressive">
            </Representation>
            <Representation id="video_eng=851000" bandwidth="851000" width="640" height="360" codecs="avc1.4D401E" scanType="progressive">
            </Representation>
            <Representation id="video_eng=1302000" bandwidth="1302000" width="640" height="480" sar="4:3" codecs="avc1.4D401F" scanType="progressive">
            </Representation>
            <Representation id="video_eng=2601000" bandwidth="2601000" width="1024" height="576" codecs="avc1.4D401F" scanType="progressive">
            </Representation>
            <Representation id="video_eng=2701000" bandwidth="2701000" width="1280" height="720" codecs="avc1.4D401F" scanType="progressive">
            </Representation>
            <Representation id="video_eng=3501000" bandwidth="3501000" width="1280" height="720" codecs="avc1.4D401F" scanType="progressive">
            </Representation>
            <Representation id="video_eng=6001000" bandwidth="6001000" width="1440" height="1080" sar="4:3" codecs="avc1.4D4028" scanType="progressive">
            </Representation>
            <Representation id="video_eng=9001000" bandwidth="9001000" width="1920" height="1080" codecs="avc1.4D4028" scanType="progressive">
            </Representation>
        </AdaptationSet>

157405
501000
851000
1302000
2601000
2701000
3501000
6001000
9001000
1901000
<AdaptationSet xmlns="urn:mpeg:dash:schema:mpd:2011" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" group="2" contentType="video" lang="en" par="20:11" minBandwidth="1901000" maxBandwidth="1901000" minWidth="872" maxWidth="872" segmentAlignment="true" width="720" height="480" sar="40:33" frameRate="25" mimeType="video/mp4" codecs="avc1.4D401F" startWithSAP="1">
            <Representation id="video_eng=1901000" bandwidth="1901000" scanType="progressive">
            </Representation>
        </AdaptationSet>

157405
501000
851000
1302000
2601000
2701000
3501000
6001000
9001000
1901000

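So although the correct AdaptationSet elements are found, the complete tree is processed in both iterations. I know I could build a single XPath to get the bandwidth, but I need the AdaptationSet beforehand and want the second loop to use only the result of the first loop. How can I do that?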


2 Answers

Try using a relative XPath:

for bandwidth in r.xpath('.//ns:Representation/@bandwidth',namespaces=namespaces):

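The leading . makes the XPath start from the current element. Without it, as you observed, an expression beginning with // is evaluated from the document root, no matter which element you call xpath() on.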
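As a minimal sketch of the corrected loop, the snippet below uses a trimmed-down stand-in for the MPD from the question (an assumption made to keep it self-contained: same namespace, one audio and one video AdaptationSet). The name video_bandwidths is only illustrative.

```python
from lxml import etree

# Reduced stand-in for Data.xml from the question (assumption: same default
# namespace, one audio and one video AdaptationSet).
MPD = b"""<?xml version="1.0" encoding="utf-8"?>
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet contentType="audio">
      <Representation id="audio_tur=157405" bandwidth="157405"/>
    </AdaptationSet>
    <AdaptationSet contentType="video">
      <Representation id="video_eng=501000" bandwidth="501000"/>
      <Representation id="video_eng=851000" bandwidth="851000"/>
    </AdaptationSet>
  </Period>
</MPD>"""

namespaces = {"ns": "urn:mpeg:dash:schema:mpd:2011"}
root = etree.fromstring(MPD)

video_bandwidths = []
for r in root.xpath('//ns:AdaptationSet[@contentType="video"]', namespaces=namespaces):
    # The leading "." anchors the search at r instead of the document root,
    # so only Representation elements below this AdaptationSet are matched.
    for bandwidth in r.xpath('.//ns:Representation/@bandwidth', namespaces=namespaces):
        video_bandwidths.append(str(bandwidth))

print(video_bandwidths)  # ['501000', '851000']
```

The audio bandwidth 157405 no longer appears, because the inner query is now scoped to each video AdaptationSet.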

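You have to add a . at the beginning of the XPath to make it relative to the current context node, which in this case is the element referenced by the variable r: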

r.xpath('.//ns:Representation/@bandwidth',namespaces=namespaces)

This behavior is described in the XPath 1.0 documentation as follows:

  • //para selects all the para descendants of the document root and thus selects all para elements in the same document as the context node

  • .//para selects the para element descendants of the context node
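The two quoted rules can be checked side by side. The sketch below evaluates three expressions on the same video AdaptationSet element; the tiny inline document and the variable names (doc, video, absolute, relative, children) are assumptions for illustration, not part of the original question.

```python
from lxml import etree

namespaces = {"ns": "urn:mpeg:dash:schema:mpd:2011"}

# Tiny inline document (assumption): one audio and one video AdaptationSet.
doc = etree.fromstring(
    b'<MPD xmlns="urn:mpeg:dash:schema:mpd:2011"><Period>'
    b'<AdaptationSet contentType="audio"><Representation bandwidth="157405"/></AdaptationSet>'
    b'<AdaptationSet contentType="video"><Representation bandwidth="501000"/></AdaptationSet>'
    b'</Period></MPD>'
)
video = doc.xpath('//ns:AdaptationSet[@contentType="video"]', namespaces=namespaces)[0]

# "//" starts at the document root, even though xpath() is called on `video`.
absolute = [str(b) for b in video.xpath('//ns:Representation/@bandwidth', namespaces=namespaces)]

# ".//" searches the descendants of the context node only.
relative = [str(b) for b in video.xpath('.//ns:Representation/@bandwidth', namespaces=namespaces)]

# No leading slash at all: a relative path over direct children of the context node.
children = [str(b) for b in video.xpath('ns:Representation/@bandwidth', namespaces=namespaces)]

print(absolute)  # ['157405', '501000']
print(relative)  # ['501000']
print(children)  # ['501000']
```

Because Representation elements are direct children of AdaptationSet in this document, the child-axis form ns:Representation/@bandwidth gives the same result as .// here; the two differ only when matching elements can be nested more deeply.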
