How do I navigate to a specific tag in Beautiful Soup?

import bs4 as bs
import urllib.request  # Python 3 home of urlopen

source = urllib.request.urlopen("https://taripebi.ge/%E1%83%91%E1%83%94%E1%83%9C%E1%83%96%E1%83%98%E1%83%9C%E1%83%98%E1%83%A1-%E1%83%A4%E1%83%90%E1%83%A1%E1%83%94%E1%83%91%E1%83%98").read()
soup = bs.BeautifulSoup(source, 'lxml')

# Iterating over the Tag returned by find() walks its direct children
for paragraph in soup.find('div', style="width: 40%;/* float: left; */"):
    print(paragraph)

1 Answer

User

#1 · Posted 2024-04-20 06:25:53

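Every time I run the code I get a different output.

Yes, because the page returns different results on each request. Even if your selector were wrong, that would not explain why a different result is printed each time. I ran the following several times and got a different result on every run.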

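from io import StringIO

import pandas as pd
import requests

r = requests.get("https://taripebi.ge/%E1%83%91%E1%83%94%E1%83%9C%E1%83%96%E1%83%98%E1%83%9C%E1%83%98%E1%83%A1-%E1%83%A4%E1%83%90%E1%83%A1%E1%83%94%E1%83%91%E1%83%98")
# read_html returns a list of DataFrames, one per <table> found in the page;
# recent pandas versions expect literal HTML to be wrapped in StringIO
tables = pd.read_html(StringIO(r.text))
print(tables)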

Output

Run 1

[    0       1       2       3       4       5        6      7
0 NaN    -00  2.4992  2.5700    2.64    2.63  2.59100   -00
1 NaN    -00  2.3593  2.4800    2.58    -00     2.53   -00
2 NaN    -00  2.0493  2.2495    -00  2.0500   2.2400   -00
3 NaN    -00  2.4300  2.5300    2.63  2.4510     2.58   -00
4 NaN  2.3593  2.4100  2.4900  2.6300  2.4910     2.59   -00
5 NaN    -00  2.1593  2.4295    -00  2.2010   2.4500   -00
6 NaN  2.0400  2.1493  2.2495    -00    2.05     -00   2.24]

Run 2

[    0       1       2       3       4       5        6      7
0 NaN    -00  2.3593  2.4800    2.58    -00     2.53   -00
1 NaN    -00  2.4300  2.5300    2.63  2.4510     2.58   -00
2 NaN    -00  2.1593  2.4295    -00  2.2010   2.4500   -00
3 NaN  2.3593  2.4100  2.4900  2.6300  2.4910     2.59   -00
4 NaN  2.0400  2.1493  2.2495    -00    2.05     -00   2.24
5 NaN    -00  2.4992  2.5700    2.64    2.63  2.59100   -00
6 NaN    -00  2.0493  2.2495    -00  2.0500   2.2400   -00]
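Comparing the two printouts above row by row suggests that both runs contain the same seven rows and only their order changes. A minimal sketch of that check follows; the rows are copied by hand from the output above (index and NaN column dropped, values kept as strings), and the names `parse_rows` and `same_rows_any_order` are made up for this illustration.

```python
from collections import Counter

# Rows copied from "Run 1" and "Run 2" above, without the index and NaN columns.
RUN1 = """\
-00 2.4992 2.5700 2.64 2.63 2.59100 -00
-00 2.3593 2.4800 2.58 -00 2.53 -00
-00 2.0493 2.2495 -00 2.0500 2.2400 -00
-00 2.4300 2.5300 2.63 2.4510 2.58 -00
2.3593 2.4100 2.4900 2.6300 2.4910 2.59 -00
-00 2.1593 2.4295 -00 2.2010 2.4500 -00
2.0400 2.1493 2.2495 -00 2.05 -00 2.24
"""

RUN2 = """\
-00 2.3593 2.4800 2.58 -00 2.53 -00
-00 2.4300 2.5300 2.63 2.4510 2.58 -00
-00 2.1593 2.4295 -00 2.2010 2.4500 -00
2.3593 2.4100 2.4900 2.6300 2.4910 2.59 -00
2.0400 2.1493 2.2495 -00 2.05 -00 2.24
-00 2.4992 2.5700 2.64 2.63 2.59100 -00
-00 2.0493 2.2495 -00 2.0500 2.2400 -00
"""


def parse_rows(text):
    """Turn a whitespace-separated block into a list of row tuples."""
    return [tuple(line.split()) for line in text.splitlines() if line.strip()]


def same_rows_any_order(a, b):
    """True when both lists hold the same rows, ignoring their order."""
    return Counter(a) == Counter(b)


run1_rows = parse_rows(RUN1)
run2_rows = parse_rows(RUN2)

print(run1_rows == run2_rows)                      # order-sensitive comparison
print(same_rows_any_order(run1_rows, run2_rows))   # order-insensitive comparison
```

If the order-insensitive comparison holds while the plain one does not, any code that relies on a row's position in the raw response will see a different value from run to run.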

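Ideally, your code should print 2.41 (the value given in the question) on every run.

What is happening is that the page performs some JavaScript authorization in the background and only then populates the valid data.

For pages like this, it is best to use Selenium.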

from selenium import webdriver
from bs4 import BeautifulSoup

driver = webdriver.Firefox()
driver.get('https://taripebi.ge/%E1%83%91%E1%83%94%E1%83%9C%E1%83%96%E1%83%98%E1%83%9C%E1%83%98%E1%83%A1-%E1%83%A4%E1%83%90%E1%83%A1%E1%83%94%E1%83%91%E1%83%98')
# page_source holds the DOM after the browser has run the page's JavaScript
source = driver.page_source
driver.quit()

soup = BeautifulSoup(source, 'lxml')
for paragraph in soup.find('div', style="width: 40%;/* float: left; */"):
    print(paragraph)

Output

Run 1

2.41

Run 2

2.41
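Because the data is filled in by JavaScript, reading the page source immediately after `driver.get` can race with the script. One possible variation, sketched here under assumptions, is to use Selenium's explicit wait until the target div is present in the live DOM. The helper names `style_selector` and `fetch_div_text` and the 15 second timeout are invented for this sketch. It also assumes that the div's presence means its value is already populated and that the browser keeps the inline style string exactly as it appears in the page source; neither is confirmed for this site.

```python
def style_selector(style: str) -> str:
    """Build a CSS attribute selector for a div with exactly this inline style.

    Assumes the style string contains no double quotes.
    """
    return f'div[style="{style}"]'


def fetch_div_text(url: str, style: str, timeout: float = 15.0) -> str:
    """Open the page in Firefox, wait for the div, and return its text."""
    # Selenium is imported here so style_selector stays usable without it.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        # Poll until the element exists, or raise TimeoutException after `timeout` seconds.
        element = WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, style_selector(style)))
        )
        return element.text
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()


# Example call (needs Firefox and geckodriver; not executed here):
# print(fetch_div_text("<page URL from the question>", "width: 40%;/* float: left; */"))
```

If the div turns out to exist before its value is filled in, a different wait condition (for example, waiting for non-empty text) would be needed instead.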
