Web scraping with Selenium and BeautifulSoup

Posted 2024-04-23 09:35:40

Male | A programmer who enjoys writing Python code.

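I am trying to scrape product information from Grofers and BigBasket, but I am having trouble with the findAll() function. When I call len(imgList), the length is always 0; the result is always an empty list. How can I fix this? Can anyone help? For Grofers I also get status code 403.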

from bs4 import BeautifulSoup
url = 'https://grofers.com/cn/grocery-staples/cid/16'
driver = webdriver.Chrome(r'C:\Users\HP\data\chromedriver.exe')
driver.get(url)
html = driver.page_source
soup = BeautifulSoup(html,'html.parser')
data = soup.findAll('plp-product__name')
print(data)

from bs4 import BeautifulSoup
response = requests.get('https://grofers.com/cn/grocery-staples/cid/16')
response
content = response.content
data = BeautifulSoup(content,'html5lib')
read = data.findAll('plp-product__name ')
read

In the output, I get: []


1 answer

User

#1 · Posted 2024-04-23 09:35:40

Your code does not include the Selenium import:

from selenium import webdriver
driver = webdriver.Chrome(executable_path=r'C:\Users\HP\data\chromedriver.exe')

Then try

data = soup.select('div.plp-product__name')

or

data = soup.find_all("div",class_="plp-product__name")

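Note that the preferred method name is find_all rather than findAll; findAll is a deprecated alias in the bs4 library. Also, the first argument to find_all is a tag name, so passing 'plp-product__name' on its own searches for a tag with that name instead of an element with that class, which is why the list came back empty.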
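Here is a minimal sketch of the difference between matching by tag name and matching by class. The HTML string and product names are made up for illustration, not the real Grofers page, so the actual markup may well differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the rendered product page;
# the real page structure is an assumption here.
SAMPLE_HTML = """
<div class="plp-product">
  <div class="plp-product__name">Basmati Rice</div>
  <div class="plp-product__name">Toor Dal</div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The first positional argument is a tag name, so this looks for
# <plp-product__name> elements and matches nothing.
by_tag_name = soup.find_all("plp-product__name")

# Matching on the class attribute finds the product name elements.
by_class = soup.find_all("div", class_="plp-product__name")

# The equivalent CSS selector form.
by_selector = soup.select("div.plp-product__name")

# Extract the visible text of each matched element.
names = [tag.get_text(strip=True) for tag in by_class]
print(len(by_tag_name), names)
```

If the class-based lookup still returns nothing against the live page, the elements are probably not present in the HTML that was fetched, for example because they are rendered later by JavaScript or the request was blocked.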
