How to scrape the recommendations on a web page

3 answers

User

Reply #1 · edited 2024-05-16 03:09:11

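You can't reach this data with a plain Scrapy spider, because the page is rendered by JavaScript. To confirm this, disable JavaScript in your browser and reload the page: you will see a blank page, and if you inspect it you will find no product-related data.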
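The same check can be done on HTML that was downloaded without a browser. The sketch below uses only the standard library; the two HTML strings are made-up stand-ins for the raw response and the browser-rendered page, and the class name product-tile-container is borrowed from the XPath discussed in this thread. Treat it as a rough illustration, not a complete detector.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Records whether any start tag carries the given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.found = False

    def handle_starttag(self, tag, attrs):
        # The class attribute may be missing or hold several space-separated names.
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class in classes:
            self.found = True


def has_class(html, css_class):
    """Return True if the HTML text contains an element with css_class."""
    finder = ClassFinder(css_class)
    finder.feed(html)
    return finder.found


# Hypothetical samples: what a plain HTTP client might receive versus
# what the browser shows after the scripts have run.
static_html = '<html><body><div id="app"></div><script src="bundle.js"></script></body></html>'
rendered_html = '<html><body><div id="app"><div class="product-tile-container tile"></div></div></body></html>'

print(has_class(static_html, "product-tile-container"))
print(has_class(rendered_html, "product-tile-container"))
```

If the raw response of the real page lacks the product markup in the same way, a spider that only parses that response has nothing to extract.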

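If you want to scrape this kind of JS-rendered page, I suggest using Splash together with scrapy-splash. Splash is a rendering service that lets you scrape the data you need; it is well documented and easy to use. (It is backed by Scrapinghub, the smart minds behind Scrapy.)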
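As a rough starting point, the settings below follow the setup described in the scrapy-splash README. They are a configuration fragment for a project's settings.py, not a full spider. The SPLASH_URL value assumes a Splash instance listening locally on port 8050, which is an assumption about your environment, and the middleware names should be checked against the scrapy-splash version you install.

```python
# settings.py fragment for scrapy-splash (sketch; verify against your installed version)

# Assumption: Splash is running locally, e.g. in a Docker container, on port 8050.
SPLASH_URL = "http://localhost:8050"

# Downloader middlewares that route requests through Splash.
DOWNLOADER_MIDDLEWARES = {
    "scrapy_splash.SplashCookiesMiddleware": 723,
    "scrapy_splash.SplashMiddleware": 725,
    "scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware": 810,
}

# Spider middleware that deduplicates repeated Splash arguments.
SPIDER_MIDDLEWARES = {
    "scrapy_splash.SplashDeduplicateArgsMiddleware": 100,
}

# Splash-aware duplicate filter and HTTP cache storage.
DUPEFILTER_CLASS = "scrapy_splash.SplashAwareDupeFilter"
HTTPCACHE_STORAGE = "scrapy_splash.SplashAwareFSCacheStorage"
```

With these in place, a spider would typically yield scrapy_splash.SplashRequest objects instead of plain scrapy.Request objects, so that the response it parses is the rendered page.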

User

Reply #2 · edited 2024-05-16 03:09:11

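I'm not familiar with Python, but your XPath cannot match. Try //div[contains(@class, "product-tile-container")]//a//img/@src. A single slash means the element is a direct child of the preceding element. A double slash means the element may sit anywhere in the hierarchy below the current element.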

If you use the class product-image-container: //div[contains(@class, "product-tile-container")]//a/div[contains(@class, 'product-image-container')]//img/@src. Adding an extra path check on an intermediate div like this can make the XPath more robust.
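The difference between / and // can be tried out on a small document. This is a minimal sketch using Python's standard xml.etree.ElementTree, which supports only a subset of XPath: there is no contains() and no /@src step, so the predicates use exact class matches and the src attribute is read with get(). The markup is invented for illustration and is not the real page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup (not taken from the real site).
SAMPLE = """
<body>
  <div class="product-tile-container">
    <a href="/p/1">
      <div class="product-image-container">
        <span><img src="nested.png"/></span>
      </div>
      <img src="direct.png"/>
    </a>
  </div>
  <div class="banner"><a href="/ad"><img src="ad.png"/></a></div>
</body>
"""

root = ET.fromstring(SAMPLE)
TILE = ".//div[@class='product-tile-container']"


def srcs(path):
    """Return the src attribute of every img element matched by path."""
    return [img.get("src") for img in root.findall(path)]


# a/img: only img elements that are direct children of the link.
direct_children = srcs(TILE + "//a/img")

# a//img: img elements at any depth below the link.
any_depth = srcs(TILE + "//a//img")

# Extra check on the intermediate div, as in the second expression above.
checked = srcs(TILE + "//a/div[@class='product-image-container']//img")

print(direct_children)
print(any_depth)
print(checked)
```

None of the three paths picks up the image inside the banner div, because each one starts from the product tile container.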

I strongly recommend using a browser extension to check your XPath, for example https://chrome.google.com/webstore/detail/xpath-helper/hgimnogjllphhhkhlmebbmlgjoejdpjl

User

Reply #3 · edited 2024-05-16 03:09:11

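You can use Selenium to scroll to the bottom of the page. However, the site still takes a while to load the recommendations, so this solution uses a while loop to wait for the product recommendation section to appear: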

from selenium import webdriver
from bs4 import BeautifulSoup as soup
import time

# Selenium 3 style call; Selenium 4 expects the driver path via
# webdriver.Chrome(service=Service(path)) from selenium.webdriver.chrome.service.
d = webdriver.Chrome('/Users/jamespetullo/Downloads/chromedriver')
d.get('https://www.michaelkors.com/logo-tape-ribbed-stretch-viscose-sweater/_/R-US_MH86NXK5ZW')

# Keep scrolling to the bottom until the page height stops growing.
last_height = d.execute_script("return document.body.scrollHeight")
while True:
    d.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(0.5)
    new_height = d.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height

# Re-parse the page source until the recommendation tiles show up.
start = soup(d.page_source, 'html.parser')
while start.find('div', {'class': 'product-tile-rfk'}) is None:
    time.sleep(0.5)  # pause between polls instead of spinning the CPU
    start = soup(d.page_source, 'html.parser')

# Collect the product name from each recommendation tile.
products = [i.find_all('li', {'class': 'product-name-container'})[0].text for i in start.find_all('div', {'class': 'product-tile-rfk'})]

Output:

['Ribbed Stretch-Viscose Tank', 'Ribbed Stretch-Viscose Tank Top', 'Ribbed Stretch-Viscose Tank Top', 'Stretch-Viscose Tank', 'Striped Ribbed Sweater Tank', 'Tie-Dye Stretch-Viscose Sweater', 'Striped Stretch-Viscose Tank', 'Striped Stretch-Cotton Sweater', 'Rainbow Stretch-Viscose Short-Sleeve Sweater', 'Stretch-Viscose Cropped Tank', 'Neon Striped Stretch-Viscose Tank Top', 'Geometric Grid Stretch-Viscose Top', 'Logo Tape Stretch-Viscose Pullover', 'Logo Tape Stretch-Viscose Cropped T-Shirt', 'Logo Tape Cotton-Jersey Top', 'Logo Tape Viscose Joggers', 'Logo Tape Buttoned Track Pants', 'Contrast Stripe Joggers', 'Contrast Stripe Hooded Jacket', 'Logo Tape Stretch-Viscose Pencil Skirt', 'Logo Tape Stretch-Viscose Zip-Up Hoodie', 'Stretch-Viscose Joggers', 'Cotton Asymmetric Turtleneck', 'Logo Tape Ribbed Knit Dress']
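One caveat with the script above: the loop that waits for the recommendation section never gives up if the section fails to appear. A bounded poll is one way around that. The helper below is a minimal standard-library sketch; wait_until and tiles_loaded are hypothetical names, and tiles_loaded merely simulates a condition that becomes true on the third check instead of parsing a real page. Selenium's own WebDriverWait class covers the same need when working directly with the driver.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.5):
    """Call predicate() until it returns a truthy value or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Stand-in for "parse the page source and look for the tiles":
# the condition becomes true on the third poll.
attempts = {"count": 0}


def tiles_loaded():
    attempts["count"] += 1
    return attempts["count"] >= 3


wait_until(tiles_loaded, timeout=2.0, interval=0.01)
print(attempts["count"])
```

In the script, the predicate would re-parse the page source and return the parsed document once it contains a product-tile-rfk div, so a timeout surfaces as an exception instead of an endless loop.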
