Scraping data with Python lxml returns an adblocker message - Q&A - Python中文网

Published 2024-06-07 23:31:34

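I'm currently scraping some data from a web page for a bot I created in Discord. I have previously used lxml to scrape HTML from another website without problems, but the site I'm trying to scrape now detects an adblocker, so no matter what data I try to extract, I always get the same value.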

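My code is as follows:

`import sys
from lxml import html
import requests

def main(arg):
    # Download the profile page for the given player name
    page = requests.get("https://fortnitetracker.com/profile/pc/" + arg)
    # Parse the raw HTML into an element tree
    tree = html.fromstring(page.content)

    # Try to extract the kill/death value
    killdeath = tree.xpath('//div[@class="stats">K/d]/text()')
    print(killdeath)`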

The value I get is '\nPlease consider adding Fortnite Tracker to your adblock whitelist! Our ads support the development and hardware costs of running this site. Really hate ads? Become a

2 answers

User

Answer 1 · edited 2024-06-07 23:31:34

The website says:

To make use of our APIs we require you to use an API Key. To use the API key you need to pass it along as a header with your requests.

Did you add the header to your request? I'd also suggest making the request in Postman or a similar application, so that you can see the entire response.
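As a rough illustration of passing an API key as a request header, here is a minimal sketch. The endpoint `API_BASE` and the header name `API_KEY_HEADER` are placeholders I made up for the example, not values taken from the site; the real ones should come from the site's API documentation. The helper names `build_request` and `fetch_profile` are likewise hypothetical.

```python
from urllib.parse import quote

# Assumptions (not from the original post): both values below are placeholders.
API_BASE = "https://api.example.com/v1/profile"  # hypothetical endpoint
API_KEY_HEADER = "X-Api-Key"                     # hypothetical header name


def build_request(nickname, api_key, platform="pc"):
    """Return the (url, headers) pair for a profile lookup."""
    # Percent-encode path segments so names with spaces or slashes stay valid
    url = f"{API_BASE}/{quote(platform, safe='')}/{quote(nickname, safe='')}"
    headers = {API_KEY_HEADER: api_key}
    return url, headers


def fetch_profile(nickname, api_key, platform="pc"):
    """Send the request with the API key header and return the response."""
    import requests  # third-party; imported here so build_request works without it

    url, headers = build_request(nickname, api_key, platform)
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()  # fail loudly on 401/403 (e.g. a missing or bad key)
    return response


if __name__ == "__main__":
    url, headers = build_request("some player", "YOUR-API-KEY")
    print(url)
    print(headers)
```

Keeping the URL and header construction in a separate function makes it easy to print exactly what would be sent before any network call is made, which is similar in spirit to inspecting the request in Postman.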

User

Answer 2 · edited 2024-06-07 23:31:34

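What is probably happening is that the page you initially receive really only contains the "Please consider…" text, along with a bunch of JavaScript that actually loads the content you see in the browser. (Try printing page.content to see what you are really getting.)

In any case, because the requests library is not a full-fledged web browser, it does not execute JavaScript, so you will only ever see the adblocker message.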
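One way to act on the suggestion to look at what the server really returned is to summarize the raw HTML before running any XPath on it. The sketch below is only an illustration: the helper `summarize_html` is a hypothetical name, and the sample bytes are a made-up stand-in for `page.content`, not the real response from the site.

```python
import re


def summarize_html(content, marker):
    """Summarize raw HTML bytes, such as the page.content value from requests.

    `marker` is some text you expect to see in the fully rendered page.
    """
    text = content.decode("utf-8", errors="replace")
    return {
        "length": len(text),
        # Count opening <script> tags; closing </script> tags do not match
        "script_tags": len(re.findall(r"<script\b", text, flags=re.IGNORECASE)),
        # Case-insensitive check for the expected text in the static HTML
        "has_marker": marker.lower() in text.lower(),
    }


if __name__ == "__main__":
    # Made-up response body: placeholder text plus scripts that a browser
    # would run to fill in the real stats.
    sample = (
        b"<html><body>"
        b"<div class='stats'>Please consider adding us to your adblock whitelist!</div>"
        b"<script src='app.js'></script>"
        b"<script>render();</script>"
        b"</body></html>"
    )
    print(summarize_html(sample, "K/d"))
```

With a real response you would call `summarize_html(page.content, "K/d")`. If the expected text is absent while script tags are present, that is consistent with the data being filled in by JavaScript after the page loads, which requests and lxml will not run.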
