Pythonic way to get an XML tree element by another element in the same subtree

Posted 2024-05-14 22:07:09


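Is there a more elegant, Pythonic way to get certain elements from an XML tree, selected by another element in the same subtree, than iterating with nested loops and ifs?

That is, in pseudo-SQL: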

select UsageStatistic/PageViews/PerUser/Value from Tree where UsageStatistic/TimeRange/Days=7 

Below is a well-formed subset of an XML response from Alexa Amazon AWIS:

<?xml version="1.0" encoding="utf-8"?>
<aws:UrlInfoResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/"><Response><OperationRequest><RequestId>dsfadf</RequestId></OperationRequest><UrlInfoResult><Alexa>
<TrafficData>
<DataUrl type="canonical">yahoo.com</DataUrl>
<UsageStatistics>

<UsageStatistic>
<TimeRange>
<Days>7</Days>
</TimeRange>
<Rank>
<Value>5</Value>
<Delta>0</Delta>
</Rank>
<Reach>
<Rank>
<Value>5</Value>
<Delta>0</Delta>
</Rank>
<PerMillion>
<Value>111,200</Value>
<Delta>-0.49%</Delta>
</PerMillion>
</Reach>
<PageViews>
<PerMillion>
<Value>11,442</Value>
<Delta>-1.71%</Delta>
</PerMillion>
<Rank>
<Value>7</Value>
<Delta>1</Delta>
</Rank>
<PerUser>
<Value>6.42</Value>
<Delta>-1.20%</Delta>
</PerUser>
</PageViews>
</UsageStatistic>

<UsageStatistic>
<TimeRange>
<Days>3</Days>
</TimeRange>
<Rank>
<Value>5</Value>
<Delta>0</Delta>
</Rank>
<Reach>
<Rank>
<Value>5</Value>
<Delta>0</Delta>
</Rank>
<PerMillion>
<Value>112,130</Value>
<Delta>-14.85%</Delta>
</PerMillion>
</Reach>
<PageViews>
<PerMillion>
<Value>11,314</Value>
<Delta>-13.39%</Delta>
</PerMillion>
<Rank>
<Value>6</Value>
<Delta>0</Delta>
</Rank>
<PerUser>
<Value>7.99</Value>
<Delta>+1.4%</Delta>
</PerUser>
</PageViews>
</UsageStatistic>

<UsageStatistic>
<TimeRange>
<Months>3</Months>
</TimeRange>
<Rank>
<Value>5</Value>
<Delta>0</Delta>
</Rank>
<Reach>
<Rank>
<Value>5</Value>
<Delta>0</Delta>
</Rank>
<PerMillion>
<Value>112,130</Value>
<Delta>-14.85%</Delta>
</PerMillion>
</Reach>
<PageViews>
<PerMillion>
<Value>11,314</Value>
<Delta>-13.39%</Delta>
</PerMillion>
<Rank>
<Value>6</Value>
<Delta>0</Delta>
</Rank>
<PerUser>
<Value>6.99</Value>
<Delta>+1.6%</Delta>
</PerUser>
</PageViews>
</UsageStatistic>

</UsageStatistics>
</TrafficData>
</Alexa></UrlInfoResult><aws:ResponseStatus><aws:StatusCode>Success</aws:StatusCode></aws:ResponseStatus></Response></aws:UrlInfoResponse>

Here is the code I have so far. It reads the response XML above from the file alexa_response.xml.

import xml.etree.ElementTree as ET

prefix = "aws"
uri = "http://alexa.amazonaws.com/doc/2005-10-05/"
ET.register_namespace(prefix, uri)
tree = ET.parse('alexa_response.xml')
root = tree.getroot()
# Walk every UsageStatistic block and inspect its children by hand
for a in root.iter("UsageStatistic"):
    for b in a:
        if b.tag == 'TimeRange':
            # Print the time unit (Days or Months) and its value
            for c in b:
                print(c.tag, c.text)
        if b.tag == 'PageViews':
            for d in b:
                if d.tag == 'PerUser':
                    for f in d:
                        if f.tag == 'Value':
                            print(f.tag, f.text)
    print()

Output:

Days 7
Value 6.42

Months 3
Value 6.99

I only need:

Days 7 
Value 6.42

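That is, the PageViews/PerUser/Value of 6.42 from the same subtree in which TimeRange/Days is 7.

I would like to know whether there is a better way than iterating through lots of nested loops and ifs.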


2 Answers

Thanks to @max and @Parfait for their comments and answers. I had to modify things a little to make it work, so I am posting it as my own answer.

import lxml.etree as lET

prefix = "aws"
uri = "http://alexa.amazonaws.com/doc/2005-10-05/"
lET.register_namespace(prefix, uri)
doc = lET.parse('alexa_response.xml')
doc_root = doc.getroot()
# The predicate keeps only UsageStatistic blocks whose TimeRange/Days is 7
for value in doc_root.xpath('.//UsageStatistic[TimeRange/Days="7"]/PageViews/PerUser/Value'):
    print(value.text)
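
If lxml is not an option, the same filter can be sketched with the standard library alone. The documented ElementTree predicates only test direct children, so this version checks TimeRange/Days with findtext() instead of a nested predicate. The inline SAMPLE string and the helper name per_user_values are assumptions made for illustration; they are not part of the original post.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical sample shaped like the AWIS response above.
SAMPLE = """<UsageStatistics>
  <UsageStatistic>
    <TimeRange><Days>7</Days></TimeRange>
    <PageViews><PerUser><Value>6.42</Value></PerUser></PageViews>
  </UsageStatistic>
  <UsageStatistic>
    <TimeRange><Months>3</Months></TimeRange>
    <PageViews><PerUser><Value>6.99</Value></PerUser></PageViews>
  </UsageStatistic>
</UsageStatistics>"""


def per_user_values(root, days):
    """Collect PageViews/PerUser/Value texts from blocks whose TimeRange/Days equals `days`."""
    values = []
    for stat in root.iter("UsageStatistic"):
        # findtext() returns None when the path is missing (e.g. a Months block)
        if stat.findtext("TimeRange/Days") == str(days):
            value = stat.findtext("PageViews/PerUser/Value")
            if value is not None:
                values.append(value)
    return values


root = ET.fromstring(SAMPLE)
print(per_user_values(root, 7))  # ['6.42']
```

One loop remains, but both lookups are relative to the same UsageStatistic element, which is what keeps the result tied to the matching subtree.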

This can be done with a single XPath expression:

//UsageStatistic/PageViews/PerUser/Value[../../../TimeRange/Days=7]
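
The expression above needs a full XPath 1.0 engine such as lxml's xpath(). As a rough sketch of the same idea of stepping back up the tree, the standard library's limited XPath subset supports `..` and `[tag='text']`, so one can select the matching TimeRange, go to its parent, and descend again. The inline SAMPLE string is a hypothetical, trimmed stand-in for the response, and the rewritten path is my adaptation, not the answerer's expression.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical sample shaped like the AWIS response above.
SAMPLE = """<UsageStatistics>
  <UsageStatistic>
    <TimeRange><Days>7</Days></TimeRange>
    <PageViews><PerUser><Value>6.42</Value></PerUser></PageViews>
  </UsageStatistic>
  <UsageStatistic>
    <TimeRange><Days>3</Days></TimeRange>
    <PageViews><PerUser><Value>7.99</Value></PerUser></PageViews>
  </UsageStatistic>
</UsageStatistics>"""

root = ET.fromstring(SAMPLE)

# 1. find TimeRange elements that have a Days child with text '7'
# 2. '..' steps up to the enclosing UsageStatistic
# 3. descend to PageViews/PerUser/Value inside that same block
path = ".//TimeRange[Days='7']/../PageViews/PerUser/Value"
values = [elem.text for elem in root.findall(path)]
print(values)
```

Note that ElementTree compares the text as a string, so the value is quoted as '7' rather than written as the number 7.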
