How to extract the contents of a div (an image) with Beautiful Soup

Posted 2024-05-16 03:46:52


<div class="product_image clearfix"> <img src="https://res.sastasundar.com/incom/images/product/thumb/XPLOR-Dark-Chocolate-Brownie-1542880911-10051353-1.jpg" title="XPLOR Dark Chocolate Brownie 50 gm" class=" center-block"> </div>

I am using Python and Beautiful Soup.

I can't find this div with:

links = soup.find_all('div', attrs={'class': 'product_image clearfix'})

After that, I need to extract the image.


3 Answers

Which version of BeautifulSoup are you using? You should be able to print the contents of the div:

from bs4 import BeautifulSoup

html = """<div class="product_image clearfix">
  <img src="https://res.sastasundar.com/incom/images/product/thumb/XPLOR-Dark-Chocolate-Brownie-1542880911-10051353-1.jpg" title="XPLOR Dark Chocolate Brownie 50 gm" class=" center-block">
</div>"""

soup = BeautifulSoup(html, 'html.parser')

for div in soup.find_all('div', class_='product_image clearfix'):
  for img in div.find_all('img', recursive=False):
    print(img)
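
To go from the printed tag to the image location itself, the `src` attribute can be read from each `img` tag. The following is a minimal sketch that reuses the HTML from the question; the list name `image_urls` is only illustrative and not part of the original answer.

```python
from bs4 import BeautifulSoup

html = """<div class="product_image clearfix">
  <img src="https://res.sastasundar.com/incom/images/product/thumb/XPLOR-Dark-Chocolate-Brownie-1542880911-10051353-1.jpg" title="XPLOR Dark Chocolate Brownie 50 gm" class=" center-block">
</div>"""

soup = BeautifulSoup(html, 'html.parser')

# Collect the src attribute of every img found inside the matching divs.
image_urls = []
for div in soup.find_all('div', class_='product_image clearfix'):
    for img in div.find_all('img'):
        src = img.get('src')  # .get() returns None when the attribute is missing
        if src:
            image_urls.append(src)

print(image_urls)
```

Using `img.get('src')` rather than `img['src']` avoids a `KeyError` for tags that carry no `src` attribute.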

With the current version of BS, this should work:

links = soup.find_all('div', class_='product_image clearfix')
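
One caveat worth noting: passing a string with two class names to `class_` matches the class attribute as an exact string, so it depends on the order of the classes. If the order on the real page might differ, a CSS selector is a possible alternative. The sketch below uses a shortened, made-up HTML snippet (with the classes deliberately swapped and a placeholder `a.jpg`) purely to show the difference.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet: same classes as in the question, but in reverse order.
html = '<div class="clearfix product_image"><img src="a.jpg"></div>'
soup = BeautifulSoup(html, 'html.parser')

# Exact-string match on the class attribute: the swapped order does not match.
exact = soup.find_all('div', class_='product_image clearfix')

# CSS selector: matches a div carrying both classes, in any order.
by_selector = soup.select('div.product_image.clearfix img')
srcs = [img['src'] for img in by_selector]

print(len(exact), srcs)
```

The selector also descends straight to the `img` tags, so no second loop over the divs is needed.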

The full set is loaded dynamically. You can make the same request that the page makes:

import requests

base = 'https://res.sastasundar.com/incom/images/product/'
r = requests.get('https://www.retailershakti.com/category/loadBrandListData?MfgGroup=&categoryId=1357&size=50&page=1').json()
images = [base + i['idata'][0]['ProductImage'] for i in r]
print(images)
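
The list comprehension above raises an error if any entry lacks `idata` or `ProductImage`. A more defensive variant might look like the sketch below. It assumes the response has the shape implied by the code above (a list of objects, each with an `idata` list whose first element holds `ProductImage`); the helper name `build_image_urls` and the `sample` payload are made up for illustration and are not real data from the site.

```python
base = 'https://res.sastasundar.com/incom/images/product/'

def build_image_urls(items, base_url=base):
    """Collect image URLs, skipping entries without the expected keys."""
    urls = []
    for item in items:
        idata = item.get('idata') or []
        # Only use the first idata element when it actually names an image.
        if idata and idata[0].get('ProductImage'):
            urls.append(base_url + idata[0]['ProductImage'])
    return urls

# Hypothetical payload standing in for the JSON returned by the request.
sample = [
    {'idata': [{'ProductImage': 'thumb/example-1.jpg'}]},
    {'idata': []},
    {},
]
print(build_image_urls(sample))
```

With the real response, the call would be `build_image_urls(r)` in place of the list comprehension.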
