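Is there a more elegant Python way to get certain elements from the same subtree of an XML tree than iterating with nested loops and ifs?
I.e., in pseudo-SQL:
select UsageStatistic/PageViews/PerUser/Value from Tree where UsageStatistic/TimeRange/Days=7
Below is a well-formed subset of an XML response from Alexa Amazon AWIS: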
<?xml version="1.0" encoding="utf-8"?>
<aws:UrlInfoResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/"><Response><OperationRequest><RequestId>dsfadf</RequestId></OperationRequest><UrlInfoResult><Alexa>
<TrafficData>
<DataUrl type="canonical">yahoo.com</DataUrl>
<UsageStatistics>
<UsageStatistic>
<TimeRange>
<Days>7</Days>
</TimeRange>
<Rank>
<Value>5</Value>
<Delta>0</Delta>
</Rank>
<Reach>
<Rank>
<Value>5</Value>
<Delta>0</Delta>
</Rank>
<PerMillion>
<Value>111,200</Value>
<Delta>-0.49%</Delta>
</PerMillion>
</Reach>
<PageViews>
<PerMillion>
<Value>11,442</Value>
<Delta>-1.71%</Delta>
</PerMillion>
<Rank>
<Value>7</Value>
<Delta>1</Delta>
</Rank>
<PerUser>
<Value>6.42</Value>
<Delta>-1.20%</Delta>
</PerUser>
</PageViews>
</UsageStatistic>
<UsageStatistic>
<TimeRange>
<Days>3</Days>
</TimeRange>
<Rank>
<Value>5</Value>
<Delta>0</Delta>
</Rank>
<Reach>
<Rank>
<Value>5</Value>
<Delta>0</Delta>
</Rank>
<PerMillion>
<Value>112,130</Value>
<Delta>-14.85%</Delta>
</PerMillion>
</Reach>
<PageViews>
<PerMillion>
<Value>11,314</Value>
<Delta>-13.39%</Delta>
</PerMillion>
<Rank>
<Value>6</Value>
<Delta>0</Delta>
</Rank>
<PerUser>
<Value>7.99</Value>
<Delta>+1.4%</Delta>
</PerUser>
</PageViews>
</UsageStatistic>
<UsageStatistic>
<TimeRange>
<Months>3</Months>
</TimeRange>
<Rank>
<Value>5</Value>
<Delta>0</Delta>
</Rank>
<Reach>
<Rank>
<Value>5</Value>
<Delta>0</Delta>
</Rank>
<PerMillion>
<Value>112,130</Value>
<Delta>-14.85%</Delta>
</PerMillion>
</Reach>
<PageViews>
<PerMillion>
<Value>11,314</Value>
<Delta>-13.39%</Delta>
</PerMillion>
<Rank>
<Value>6</Value>
<Delta>0</Delta>
</Rank>
<PerUser>
<Value>6.99</Value>
<Delta>+1.6%</Delta>
</PerUser>
</PageViews>
</UsageStatistic>
</UsageStatistics>
</TrafficData>
</Alexa></UrlInfoResult><aws:ResponseStatus><aws:StatusCode>Success</aws:StatusCode></aws:ResponseStatus></Response></aws:UrlInfoResponse>
This is the code I have so far. It reads the response XML above from a file named alexa_response.xml.
import xml.etree.ElementTree as ET

prefix = "aws"
uri = "http://alexa.amazonaws.com/doc/2005-10-05/"
ET.register_namespace(prefix, uri)
tree = ET.parse('alexa_response.xml')
root = tree.getroot()

# Walk every UsageStatistic block and print its time range and per-user page views.
for a in root.iter("UsageStatistic"):
    for b in a:
        if b.tag == 'TimeRange':
            for c in b:
                print(c.tag, c.text)
        if b.tag == 'PageViews':
            for d in b:
                if d.tag == 'PerUser':
                    for f in d:
                        if f.tag == 'Value':
                            print(f.tag, f.text)
    print()
Result:
Days 7
Value 6.42
Months 3
Value 6.99
I only need:
Days 7
Value 6.42
That is, the PageViews/PerUser/Value (6.42) from the same subtree in which TimeRange/Days is 7.
Is there a better way than iterating through lots of nested loops and ifs?
Thanks to @max and @Parfait for their comments and answers. I had to modify things a little to make it work, so I had to post it as my own answer.
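A minimal sketch of one way to avoid the nested loops with the standard library alone, not necessarily the code the author ended up with: check TimeRange/Days on each UsageStatistic with findtext(), then read the value from that same element. The names SAMPLE and per_user_values are made up for this illustration, and the inline XML is a trimmed stand-in for alexa_response.xml.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for alexa_response.xml: only the elements the query touches.
SAMPLE = """<UsageStatistics>
  <UsageStatistic>
    <TimeRange><Days>7</Days></TimeRange>
    <PageViews><PerUser><Value>6.42</Value></PerUser></PageViews>
  </UsageStatistic>
  <UsageStatistic>
    <TimeRange><Days>3</Days></TimeRange>
    <PageViews><PerUser><Value>7.99</Value></PerUser></PageViews>
  </UsageStatistic>
  <UsageStatistic>
    <TimeRange><Months>3</Months></TimeRange>
    <PageViews><PerUser><Value>6.99</Value></PerUser></PageViews>
  </UsageStatistic>
</UsageStatistics>"""


def per_user_values(root, days):
    """Return PageViews/PerUser/Value texts for blocks whose TimeRange/Days equals `days`."""
    wanted = str(days)
    return [
        stat.findtext("PageViews/PerUser/Value")
        for stat in root.iter("UsageStatistic")
        # findtext() returns None when the block has no Days element (e.g. Months).
        if stat.findtext("TimeRange/Days") == wanted
    ]


root = ET.fromstring(SAMPLE)
values = per_user_values(root, 7)
print(values)  # ['6.42']
```

With the real file, root would come from ET.parse('alexa_response.xml').getroot() instead; the element names involved carry no namespace prefix, so the same relative paths should apply.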
This can be done with a single XPath expression:
//UsageStatistic/PageViews/PerUser/Value[../../../TimeRange/Days=7]
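As far as I can tell, xml.etree.ElementTree implements only a subset of XPath and would not accept the expression above as written; a full XPath 1.0 engine (for example the xpath() method in lxml) would be needed for that form. A sketch of an equivalent path that stays within ElementTree's subset: select the TimeRange whose Days child is '7', step up to its parent with '..', then descend to the value. SAMPLE and PATH are illustrative names, and the inline XML is a trimmed stand-in for the response in the question.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the response in the question.
SAMPLE = """<UsageStatistics>
  <UsageStatistic>
    <TimeRange><Days>7</Days></TimeRange>
    <PageViews><PerUser><Value>6.42</Value></PerUser></PageViews>
  </UsageStatistic>
  <UsageStatistic>
    <TimeRange><Months>3</Months></TimeRange>
    <PageViews><PerUser><Value>6.99</Value></PerUser></PageViews>
  </UsageStatistic>
</UsageStatistics>"""

# [Days='7'] keeps TimeRange elements that have a Days child with text '7';
# '..' moves up to the enclosing UsageStatistic; the rest walks down to Value.
PATH = ".//TimeRange[Days='7']/../PageViews/PerUser/Value"

root = ET.fromstring(SAMPLE)
matches = [el.text for el in root.findall(PATH)]
print(matches)  # ['6.42']
```

Note that the comparison value has to be quoted in ElementTree's predicate syntax, unlike the bare 7 in the expression above.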