Using BeautifulSoup to fetch XML data from a URL and output it to a dictionary

Posted 2024-04-19 16:29:46


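I need to read XML data (an exchange rate list) from a URL and output it as a dictionary. Right now I only get the first currency. I tried find_all, but without success. Could someone tell me where to put the for loop so that it reads all the values?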

import bs4 as bs
import urllib.request

source = urllib.request.urlopen('http://www.xxxy.hr/Downloads/PBZteclist.xml').read()
soup = bs.BeautifulSoup(source,'xml')

name = soup.find('Name').text
unit = soup.find('Unit').text
buyratecache = soup.find('BuyRateCache').text
buyrateforeign = soup.find('BuyRateForeign').text
meanrate = soup.find('MeanRate').text
sellrateforeign = soup.find('SellRateForeign').text
sellratecache = soup.find('SellRateCache').text


devize =  {'naziv_valute': '{}'.format(name),
           'jedinica': '{}'.format(unit),
           'kupovni': '{}'.format(buyratecache),
           'kupovni_strani': '{}'.format(buyrateforeign),
           'srednji': '{}'.format(meanrate),
           'prodajni_strani': '{}'.format(sellrateforeign),
           'prodajni': '{}'.format(sellratecache)}

print ("devize:",devize)

XML sample:

<ExchRates>
    <ExchRate>
        <Bank>Privredna banka Zagreb</Bank>
        <CurrencyBase>HRK</CurrencyBase>
        <Date>12.01.2019.</Date>
        <Currency Code="036">
            <Name>AUD</Name>
            <Unit>1</Unit>
            <BuyRateCache>4,485390</BuyRateCache>
            <BuyRateForeign>4,530697</BuyRateForeign>
            <MeanRate>4,646869</MeanRate>
            <SellRateForeign>4,786275</SellRateForeign>
            <SellRateCache>4,834138</SellRateCache>
        </Currency>
        <Currency Code="124">
            <Name>CAD</Name>
            <Unit>1</Unit>
            <BuyRateCache>4,724225</BuyRateCache>
            <BuyRateForeign>4,771944</BuyRateForeign>
            <MeanRate>4,869331</MeanRate>
            <SellRateForeign>4,991064</SellRateForeign>
            <SellRateCache>5,040975</SellRateCache>
        </Currency>
        <Currency Code="203">
            <Name>CZK</Name>
            <Unit>1</Unit>
            <BuyRateCache>0,280057</BuyRateCache>
            <BuyRateForeign>0,284322</BuyRateForeign>
            <MeanRate>0,290124</MeanRate>
            <SellRateForeign>0,297377</SellRateForeign>
            <SellRateCache>0,300351</SellRateCache>
        </Currency>
        ...etc...
    </ExchRate>
</ExchRates>

1 answer
User
#1 · Posted 2024-04-19 16:29:46

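Simply iterate over all the Currency nodes (rather than over the soup object), or even use a list comprehension to build a list of dictionaries: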

soup = bs.BeautifulSoup(source, 'xml')

# ALL EXCHANGE RATE NODES
currency_nodes = soup.find_all('Currency')

# LIST OF DICTIONARIES
devize_list = [{'naziv_valute': c.find('Name').text,
                'jedinica': c.find('Unit').text,
                'kupovni': c.find('BuyRateCache').text,
                'kupovni_strani': c.find('BuyRateForeign').text,
                'srednji': c.find('MeanRate').text,
                'prodajni_strani': c.find('SellRateForeign').text,
                'prodajni': c.find('SellRateCache').text
               } for c in currency_nodes]
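
Since the question asks where the for loop belongs, here is a minimal sketch of the same idea written as an explicit loop. It runs against an inline copy of two currencies from the XML sample instead of the real URL. The names SAMPLE_XML, FIELD_KEYS and parse_currencies are made up for this illustration, and the 'xml' parser assumes that lxml is installed.

```python
import bs4 as bs

# Inline stand-in for the downloaded document: two currencies copied
# from the XML sample in the question (hypothetical constant name).
SAMPLE_XML = """<ExchRates>
    <ExchRate>
        <Bank>Privredna banka Zagreb</Bank>
        <CurrencyBase>HRK</CurrencyBase>
        <Date>12.01.2019.</Date>
        <Currency Code="036">
            <Name>AUD</Name>
            <Unit>1</Unit>
            <BuyRateCache>4,485390</BuyRateCache>
            <BuyRateForeign>4,530697</BuyRateForeign>
            <MeanRate>4,646869</MeanRate>
            <SellRateForeign>4,786275</SellRateForeign>
            <SellRateCache>4,834138</SellRateCache>
        </Currency>
        <Currency Code="124">
            <Name>CAD</Name>
            <Unit>1</Unit>
            <BuyRateCache>4,724225</BuyRateCache>
            <BuyRateForeign>4,771944</BuyRateForeign>
            <MeanRate>4,869331</MeanRate>
            <SellRateForeign>4,991064</SellRateForeign>
            <SellRateCache>5,040975</SellRateCache>
        </Currency>
    </ExchRate>
</ExchRates>"""

# Maps each XML tag to the dictionary key used in the question
# (hypothetical helper table).
FIELD_KEYS = {
    'Name': 'naziv_valute',
    'Unit': 'jedinica',
    'BuyRateCache': 'kupovni',
    'BuyRateForeign': 'kupovni_strani',
    'MeanRate': 'srednji',
    'SellRateForeign': 'prodajni_strani',
    'SellRateCache': 'prodajni',
}


def parse_currencies(xml_text):
    """Return one dictionary per <Currency> element in xml_text."""
    soup = bs.BeautifulSoup(xml_text, 'xml')  # 'xml' parser needs lxml
    devize_list = []
    # The loop goes over the Currency nodes, and each find() is called
    # on the current node rather than on the whole soup.
    for currency in soup.find_all('Currency'):
        devize = {key: currency.find(tag).text
                  for tag, key in FIELD_KEYS.items()}
        devize_list.append(devize)
    return devize_list


devize_list = parse_currencies(SAMPLE_XML)
for devize in devize_list:
    print("devize:", devize)
```

The key point is that soup.find('Name') always returns the first Name in the whole document, while currency.find('Name') searches only inside the current Currency node.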

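Alternatively, since you are extracting every child element, combine it with a dictionary comprehension; the keys are then the XML tag names: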

# One dictionary per Currency node, keyed by child tag name
devize_list = [{n.name: n.text for n in c.children if n.name is not None}
               for c in currency_nodes]
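
The extracted values are still strings that use a comma as the decimal separator, as in the XML sample. If numbers are needed, a conversion step along the following lines might help. This is only a sketch: to_decimal and TEXT_KEYS are hypothetical names, the record is hand-written from the AUD entry of the sample, and it assumes the feed never uses thousands separators.

```python
from decimal import Decimal


def to_decimal(text):
    """Convert a comma-decimal string such as '4,485390' to a Decimal.

    Assumption: the comma is only ever a decimal separator.
    """
    return Decimal(text.strip().replace(',', '.'))


# Hand-written record keyed by XML tag name, copied from the AUD entry
# of the XML sample (stands in for one parsed Currency node).
devize = {
    'Name': 'AUD',
    'Unit': '1',
    'BuyRateCache': '4,485390',
    'BuyRateForeign': '4,530697',
    'MeanRate': '4,646869',
    'SellRateForeign': '4,786275',
    'SellRateCache': '4,834138',
}

# Keys whose values should stay as text (hypothetical helper set).
TEXT_KEYS = {'Name'}

numeric = {key: (value if key in TEXT_KEYS else to_decimal(value))
           for key, value in devize.items()}
print(numeric)
```

Decimal is used here instead of float so that the rates keep exactly the digits published in the feed.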
