BeautifulSoup: one element contains two identical links, how do I print only one?

Posted 2024-04-30 03:09:11

Hi, after running this code:

import requests
from bs4 import BeautifulSoup


page = requests.get('https://coinpaprika.com')
soup = BeautifulSoup(page.text, 'html.parser')

coin_list = soup.find('tbody')
coin_list_items = coin_list.find_all('a')

for coin_name in coin_list_items:
    names = coin_name.string
    links = 'https://coinpaprika.com' + coin_name.get('href')
    print(names)
    print(links)

the program prints:

None
https://coinpaprika.com/coin/btc-bitcoin/
Bitcoin
https://coinpaprika.com/coin/btc-bitcoin/
None
https://coinpaprika.com/coin/xrp-xrp/
XRP
https://coinpaprika.com/coin/xrp-xrp/
None
https://coinpaprika.com/coin/eth-ethereum/
Ethereum
https://coinpaprika.com/coin/eth-ethereum/

instead of:

Bitcoin
https://coinpaprika.com/coin/btc-bitcoin/
XRP
https://coinpaprika.com/coin/xrp-xrp/
Ethereum
https://coinpaprika.com/coin/eth-ethereum/

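I know the reason is that each coin has two links with the same href, one wrapping the icon and one wrapping the name: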

<td class="table__fixed-cell">
                    <a href="/coin/btc-bitcoin/"><span class="coin-icon currency_images-0"></span></a>
                </td>


<td class="table__fixed-cell">
                    <a href="/coin/btc-bitcoin/">Bitcoin</a>
                    <small>BTC</small>
                </td>

But I still don't know how to print only the second one. Can anyone help me?


2 answers

Just find only the tags that contain text.

coin_list_items = coin_list.find_all('a', string=True)  # string= is the current name for the older text= argument
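
To try this filter without hitting the live site, here is a minimal sketch that runs against the markup quoted in the question. The `SAMPLE_HTML` constant and the `results` list are names introduced only for illustration, and the real page may be structured differently today. `string=True` is the current spelling of the older `text=True` argument; both keep only tags whose `.string` is not `None`.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (assumption: one table row only).
SAMPLE_HTML = """
<table><tbody><tr>
<td class="table__fixed-cell">
    <a href="/coin/btc-bitcoin/"><span class="coin-icon currency_images-0"></span></a>
</td>
<td class="table__fixed-cell">
    <a href="/coin/btc-bitcoin/">Bitcoin</a>
    <small>BTC</small>
</td>
</tr></tbody></table>
"""

soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
coin_list = soup.find('tbody')

# string=True matches only <a> tags whose .string is not None,
# so the icon-only link (which wraps an empty <span>) is skipped.
text_links = coin_list.find_all('a', string=True)

results = [(a.string, 'https://coinpaprika.com' + a.get('href')) for a in text_links]
for name, link in results:
    print(name)
    print(link)
```

Note that `.string` is also `None` when a tag has more than one child, so this filter would drop an anchor that mixes text with other tags.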

Some links have empty anchor text because they only wrap the icon image:

<a href="/coin/btc-bitcoin/"><span class="coin-icon currency_images-0"></span></a>

Add a check:

for coin_name in coin_list_items:
    names = coin_name.string
    # Skip icon-only links, whose .string is None
    if not names:
        continue
    links = 'https://coinpaprika.com' + coin_name.get('href')
    print(names)
    print(links)
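
The same check can be wrapped in a small function and tried offline. This is only a sketch under stated assumptions: `extract_coins`, its `base_url` parameter, and the `sample` string are hypothetical names, not part of the original code. It also swaps `.string` for `get_text(strip=True)`, a variation that still keeps an anchor whose text sits inside a nested tag, and it uses `href=True` so anchors without an `href` are ignored.

```python
from bs4 import BeautifulSoup


def extract_coins(html, base_url='https://coinpaprika.com'):
    """Return (name, link) pairs for <tbody> anchors that have visible text."""
    soup = BeautifulSoup(html, 'html.parser')
    coin_list = soup.find('tbody')
    if coin_list is None:  # no table body in this markup
        return []
    pairs = []
    # href=True skips anchors that have no href attribute at all.
    for anchor in coin_list.find_all('a', href=True):
        name = anchor.get_text(strip=True)
        if not name:  # icon-only link: no text to print
            continue
        pairs.append((name, base_url + anchor['href']))
    return pairs


# Hypothetical sample based on the markup quoted in the question.
sample = (
    '<table><tbody><tr>'
    '<td><a href="/coin/btc-bitcoin/"><span class="coin-icon"></span></a></td>'
    '<td><a href="/coin/btc-bitcoin/">Bitcoin</a><small>BTC</small></td>'
    '</tr></tbody></table>'
)

for name, link in extract_coins(sample):
    print(name)
    print(link)
```

Returning a list of pairs instead of printing inside the loop makes the result easy to reuse, for example to write the names and links to a file.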