List comprehension that returns the highest value

Posted 2024-04-19 01:51:53


I am scraping a website and need to know how many pages there are to scrape. The pagination markup looks like this:

<div class="pagination">
    <a href="/travel__world-desktop-wallpapers/page/2">2</a>
    <a href="/travel__world-desktop-wallpapers/page/3">3</a>
    <a href="/travel__world-desktop-wallpapers/page/4">4</a>
    ...
    <a href="/travel__world-desktop-wallpapers/page/31">31</a>
    <a href="/travel__world-desktop-wallpapers/page/32">32</a>
    <a href="/travel__world-desktop-wallpapers/page/33">33</a>
    <a href="/travel__world-desktop-wallpapers/page/2">Next »</a>
</div>

How can I write a list comprehension that returns the highest page number (33 in this example)?


1 answer
User
#1 · Posted 2024-04-19 01:51:53

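You don't need a list comprehension. max() accepts any iterable, so use a generator expression instead and skip building an intermediate list: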

max(int(link.text) 
    for link in soup.find('div', class_='pagination').find_all('a')
    if link.text.strip().isdigit())

Demo:

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('''\
... <div class="pagination">
...     <a href="/travel__world-desktop-wallpapers/page/2">2</a>
...     <a href="/travel__world-desktop-wallpapers/page/3">3</a>
...     <a href="/travel__world-desktop-wallpapers/page/4">4</a>
...     ...
...     <a href="/travel__world-desktop-wallpapers/page/31">31</a>
...     <a href="/travel__world-desktop-wallpapers/page/32">32</a>
...     <a href="/travel__world-desktop-wallpapers/page/33">33</a>
...     <a href="/travel__world-desktop-wallpapers/page/2">Next »</a>
... </div>
... ''', 'html.parser')
>>> max(int(link.text) for link in soup.find('div', class_='pagination').find_all('a') if link.text.strip().isdigit())
33
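
One caveat with the expression above: max() raises ValueError when the iterable is empty, which would happen on a page whose pagination block has no numeric links. A minimal sketch of a more defensive variant follows, assuming that falling back to page 1 is acceptable in that case. The helper name last_page and its default parameter are hypothetical and not part of the original answer, and the link labels are passed in as plain strings so the example runs without bs4:

```python
def last_page(link_texts, default=1):
    """Return the largest numeric label in link_texts.

    Hypothetical helper: falls back to `default` when no label is numeric.
    """
    # Generator expression: max() consumes it lazily, so no list is built.
    # The default= keyword avoids ValueError on an empty sequence.
    return max(
        (int(text) for text in link_texts if text.strip().isdigit()),
        default=default,
    )


# Link labels as they appear in the pagination block from the question.
labels = ["2", "3", "4", "31", "32", "33", "Next »"]
print(last_page(labels))       # 33
print(last_page(["Next »"]))   # 1, because no label is numeric
```

With the soup object from the demo, the labels could be supplied as (link.text for link in soup.find('div', class_='pagination').find_all('a')).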
