How do I extract values from an XML tree with Python?

Posted 2024-04-20 06:30:54


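I have an API query that returns the XML tree below, and I want to extract certain values from it. In particular, I want to pull out information such as LinksInCount.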

<aws:UrlInfoResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
<aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11">
<aws:OperationRequest>
<aws:RequestId>5486794a-0d03-4d47-a45b-e95764c3f0ee</aws:RequestId>
</aws:OperationRequest>
<aws:UrlInfoResult>
<aws:Alexa>

  <aws:ContentData>
    <aws:DataUrl type="canonical">yahoo.com/</aws:DataUrl>
    <aws:Asin>B00006D2TC</aws:Asin>
    <aws:SiteData>
      <aws:Title>Yahoo!</aws:Title>
      <aws:Description>Personalized content and search options. Chatrooms, free e-mail, clubs, and pager.</aws:Description>
      <aws:OnlineSince>18-Jan-1995</aws:OnlineSince>
    </aws:SiteData>
    <aws:Speed>
      <aws:MedianLoadTime>2242</aws:MedianLoadTime>
      <aws:Percentile>51</aws:Percentile>
    </aws:Speed>
    <aws:AdultContent>no</aws:AdultContent>
    <aws:Language>
      <aws:Locale>en</aws:Locale>
    </aws:Language>
    <aws:LinksInCount>76894</aws:LinksInCount>
    <aws:OwnedDomains>
      <aws:OwnedDomain>
        <aws:Domain>yahooligans.com</aws:Domain>
        <aws:Title>yahooligans.com</aws:Title>
      </aws:OwnedDomain>
    </aws:OwnedDomains>
  </aws:ContentData>

  <aws:Related>
    <aws:DataUrl type="canonical">yahoo.com/</aws:DataUrl>
    <aws:Asin>B00006D2TC</aws:Asin>
    <aws:RelatedLinks>
      <aws:RelatedLink>
        <aws:DataUrl type="canonical">aol.com/</aws:DataUrl>
        <aws:NavigableUrl>http://aol.com/</aws:NavigableUrl>
        <aws:Asin>B00006ARD3</aws:Asin>
        <aws:Relevance>301</aws:Relevance>
      </aws:RelatedLink>
    </aws:RelatedLinks>
    <aws:Categories>
      <aws:CategoryData>
        <aws:Title>On the Web/Web Portals</aws:Title>
        <aws:AbsolutePath>Top/Computers/Internet/On_the_Web/Web_Portals</aws:AbsolutePath>
      </aws:CategoryData>
    </aws:Categories>
  </aws:Related>        

  <aws:TrafficData>
    <aws:DataUrl type="canonical">yahoo.com/</aws:DataUrl>
    <aws:Asin>B00006D2TC</aws:Asin>
    <aws:Rank>1</aws:Rank>
    <aws:UsageStatistics>

      <aws:UsageStatistic>
        <aws:TimeRange>
          <aws:Days>1</aws:Days>
        </aws:TimeRange>
        <aws:Rank>
          <aws:Value>1</aws:Value>
          <aws:Delta>+0</aws:Delta>
        </aws:Rank>
        <aws:Reach>
          <aws:Rank>
            <aws:Value>2</aws:Value>
            <aws:Delta>+0</aws:Delta>
          </aws:Rank>
          <aws:PerMillion>
            <aws:Value>252,500</aws:Value>
            <aws:Delta>-1%</aws:Delta>
          </aws:PerMillion>
        </aws:Reach>
        <aws:PageViews>
          <aws:PerMillion>
            <aws:Value>51,400</aws:Value>
            <aws:Delta>-1%</aws:Delta>
          </aws:PerMillion>
          <aws:Rank>
            <aws:Value>1</aws:Value>
            <aws:Delta>+0</aws:Delta>
          </aws:Rank>
          <aws:PerUser>
            <aws:Value>13.7</aws:Value>
            <aws:Delta>-1%</aws:Delta>
          </aws:PerUser>
        </aws:PageViews>
      </aws:UsageStatistic>

    </aws:UsageStatistics>
  </aws:TrafficData>

</aws:Alexa>
</aws:UrlInfoResult>
<aws:ResponseStatus xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
<aws:StatusCode>Success</aws:StatusCode>
</aws:ResponseStatus>
</aws:Response>
</aws:UrlInfoResponse> 

Once I have the tree, I can get the response status with the following code:

elem = tree.find("//{http://alexa.amazonaws.com/doc/2005-10-05/}StatusCode")
print(elem.text)

However, I can't work out how to get the value contained in LinksInCount:

 <aws:LinksInCount>76894</aws:LinksInCount>

I have tried the following:

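# Attempt 1: same namespace URI that works for StatusCode
elem = tree.find("//{http://alexa.amazonaws.com/doc/2005-10-05/}LinksInCount")
print(elem.text)

# Attempt 2: bare tag name, no namespace
elem = tree.find("LinksInCount")
print(elem.text)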
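
One likely cause, going by the XML above: the `aws:Response` element rebinds the `aws` prefix to `http://awis.amazonaws.com/doc/2005-07-11`, and only `aws:ResponseStatus` switches it back to the `2005-10-05` URI. So `LinksInCount` would sit in a different namespace from `StatusCode`, and neither the `2005-10-05` lookup nor the bare tag name would match it. Below is a minimal sketch of that idea. It assumes the standard library's `xml.etree.ElementTree` (the question doesn't say which parser built `tree`), it uses a trimmed copy of the response, and the prefix names `alexa` and `awis` are made up for the example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response above; only the elements needed here are kept.
SAMPLE = """\
<aws:UrlInfoResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
<aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11">
<aws:UrlInfoResult>
<aws:Alexa>
  <aws:ContentData>
    <aws:LinksInCount>76894</aws:LinksInCount>
  </aws:ContentData>
</aws:Alexa>
</aws:UrlInfoResult>
<aws:ResponseStatus xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
<aws:StatusCode>Success</aws:StatusCode>
</aws:ResponseStatus>
</aws:Response>
</aws:UrlInfoResponse>
"""

# Hypothetical prefix names chosen for this sketch; only the URIs come from the XML.
NS = {
    "alexa": "http://alexa.amazonaws.com/doc/2005-10-05/",
    "awis": "http://awis.amazonaws.com/doc/2005-07-11",
}

root = ET.fromstring(SAMPLE)

# Print each element's fully qualified tag to see which namespace it resolves to.
for el in root.iter():
    print(el.tag)

# ".//" searches all descendants; the prefix is resolved through the NS mapping.
status = root.find(".//alexa:StatusCode", NS)
links = root.find(".//awis:LinksInCount", NS)

# find() returns None when nothing matches, so check before reading .text.
status_text = status.text if status is not None else None
links_in_count = int(links.text) if links is not None else None
print(status_text, links_in_count)
```

If that reading is right, the first attempt fails only because of the namespace URI. The second fails because the name carries no namespace at all, and also because a path without `.//` looks only at direct children of the starting element.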

http://docs.aws.amazon.com/AlexaWebInfoService/latest/

