XPath for getting the Amazon price

Asked 2025-04-18 00:02

First, here is the URL:

http://www.amazon.in/gp/product/B00EYCBFDQ/ref=s9_pop_gw_g147_i3?pf_rd_m=A1VBAL9TL5WCBF&pf_rd_s=center-3&pf_rd_r=1YP3T548XBFHJ1RA3EH8&pf_rd_t=101&pf_rd_p=402518447&pf_rd_i=1320006031

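The above is a link to a product page on www.amazon.in. I want to get the actual price, which is Rs.4,094. Below is Python code that tries to print that price. I used //span[@id="actualPriceValue"]/text() to get the price, but it returns an empty list. Can anyone suggest how I can get the price?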

from lxml import html
import requests

page = requests.get('http://www.amazon.in/gp/product/B00EYCBFDQ/ref=s9_pop_gw_g147_i3?pf_rd_m=A1VBAL9TL5WCBF&pf_rd_s=center-3&pf_rd_r=1YP3T548XBFHJ1RA3EH8&pf_rd_t=101&pf_rd_p=402518447&pf_rd_i=1320006031')
tree = html.fromstring(page.text)
price = tree.xpath('//span[@id="actualPriceValue"]/text()')

print(price)

Tags: data-extraction xpath web-scraping web-crawler amazon

2 Answers

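I think the problem is that the span with the id actualPriceValue has no text content directly inside it. You probably need something like this (this is off the top of my head, so you may need to adjust it to the actual page):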

Edit: this has been fixed. The explanation below still holds.

//*[@id='actualPriceValue']/b/span/text()

You'll notice that the HTML looks like this:

<span id="actualPriceValue">
    <b class="priceLarge">
       <span style="text-decoration: inherit; white-space: nowrap;">
           <span class="currencyINR">&nbsp;&nbsp;</span>
           <span class="currencyINRFallback" style="display:none">Rs. </span>
           4,112.00
       </span>
    </b>
</span>

So the path should be:

span with an id of actualPriceValue -> child b element -> child span element -> its text nodes
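To check the explanation above without hitting the live site, here is a minimal sketch that runs both XPath expressions against the HTML snippet from this answer. It assumes the real page has the same structure, which may no longer be true. SAMPLE, direct_text and price are names made up for this illustration. Because the snippet is pretty-printed, the inner span has several text nodes (whitespace around the child spans, then the price), so the sketch joins and strips them instead of taking the first one.

```python
from lxml import html

# Sample markup copied from the answer above; the live Amazon page may differ.
SAMPLE = """
<span id="actualPriceValue">
    <b class="priceLarge">
       <span style="text-decoration: inherit; white-space: nowrap;">
           <span class="currencyINR">&nbsp;&nbsp;</span>
           <span class="currencyINRFallback" style="display:none">Rs. </span>
           4,112.00
       </span>
    </b>
</span>
"""

tree = html.fromstring(SAMPLE)

# Text nodes directly under the outer span: whitespace at most, never the price.
direct = tree.xpath('//span[@id="actualPriceValue"]/text()')
direct_text = "".join(direct).strip()

# Text nodes directly under the inner span: whitespace plus the price itself.
nested = tree.xpath("//*[@id='actualPriceValue']/b/span/text()")
price = "".join(nested).strip()

print(repr(direct_text))  # ''
print(price)              # 4,112.00
```

Joining the text nodes is a choice made for this sketch. If the real markup has no whitespace between the tags, taking the first text node gives the same result.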

Answered 2025-04-18 by Python大师

Use the following XPath:

price = tree.xpath("//*[@id='actualPriceValue']/b/span/text()")[0]

The following code works:

from lxml import html
import requests

page = requests.get('http://www.amazon.in/gp/product/B00EYCBFDQ/ref=s9_pop_gw_g147_i3?pf_rd_m=A1VBAL9TL5WCBF&pf_rd_s=center-3&pf_rd_r=1YP3T548XBFHJ1RA3EH8&pf_rd_t=101&pf_rd_p=402518447&pf_rd_i=1320006031')
tree = html.fromstring(page.text)
price = tree.xpath("//*[@id='actualPriceValue']/b/span/text()")[0]

print(price)

Result:

4,094.00
[Finished in 3.0s]
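One caveat with indexing the result with [0]: if the XPath matches nothing, it raises an IndexError, and the value you get back is still a string with a thousands separator. Below is a rough sketch of a helper that deals with both. parse_price is a hypothetical name, not part of lxml, and it assumes prices are formatted like 4,094.00 (comma as thousands separator, dot as decimal point).

```python
from decimal import Decimal, InvalidOperation


def parse_price(text_nodes):
    """Return the first text node that parses as a price, or None.

    `text_nodes` is a list of strings, such as the list tree.xpath(...) returns.
    Assumes a comma is the thousands separator and a dot the decimal point.
    """
    for node in text_nodes:
        cleaned = node.strip().replace(",", "")
        if not cleaned:
            continue  # skip whitespace-only text nodes
        try:
            return Decimal(cleaned)
        except InvalidOperation:
            continue  # not a number, e.g. a currency label
    return None


print(parse_price(["4,094.00"]))                 # 4094.00
print(parse_price(["\n   ", "\n   4,112.00\n"]))  # 4112.00
print(parse_price([]))                           # None
```

In the code above you would call it as parse_price(tree.xpath("//*[@id='actualPriceValue']/b/span/text()")) and check for None before using the value.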

Let us know if this helps.

Answered 2025-04-18 by Python大师
