Selenium/PhantomJS network capture

from selenium import webdriver
from browsermobproxy import Server

server = Server(<path to browsermob-proxy>)
server.start()
proxy = server.create_proxy({'captureHeaders': True, 'captureContent': True, 'captureBinaryContent': True})
service_args = ["--proxy=%s" % proxy.proxy, '--ignore-ssl-errors=yes']
driver = webdriver.PhantomJS(service_args=service_args)

proxy.new_har()
driver.get('https://google.com')
print(proxy.har)  # this is the archive
# for example:
all_requests = [entry['request']['url'] for entry in proxy.har['log']['entries']]

2 answers

User

Answer 1 · edited 2024-06-01 00:31:30

I use a proxy for this:

from selenium import webdriver
from browsermobproxy import Server

# environment.b_mob_proxy_path: path to the browsermob-proxy executable
server = Server(environment.b_mob_proxy_path)
server.start()
proxy = server.create_proxy()
# PhantomJS expects --proxy (--proxy-server is the Chrome flag)
service_args = ["--proxy=%s" % proxy.proxy]
driver = webdriver.PhantomJS(service_args=service_args)

proxy.new_har()  # start recording traffic
driver.get('url_to_open')
print(proxy.har)  # this is the archive
# for example:
all_requests = [entry['request']['url'] for entry in proxy.har['log']['entries']]

# clean up when done
driver.quit()
server.stop()

The HAR (HTTP Archive) format carries a lot of other information about each request and response, which I found very useful.
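As a rough illustration of that extra information, here is a minimal sketch that pulls the method, URL, status code and MIME type out of each entry. The helper names `summarize_har` and `failed_requests` are made up for this example, and `sample_har` is a hand-written stand-in for `proxy.har` (the field names follow the HAR 1.2 layout), so treat it as a starting point rather than a finished tool.

```python
from collections import Counter


def summarize_har(har):
    """Return one (method, url, status, mime_type) tuple per HAR entry."""
    rows = []
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        response = entry.get("response", {})
        rows.append((
            request.get("method"),
            request.get("url"),
            response.get("status"),
            response.get("content", {}).get("mimeType"),
        ))
    return rows


def failed_requests(har):
    """Return the URLs whose response status is 400 or above."""
    return [url for _, url, status, _ in summarize_har(har)
            if status is not None and status >= 400]


# Hypothetical sample data; in a real run this would be proxy.har
sample_har = {
    "log": {
        "entries": [
            {
                "request": {"method": "GET", "url": "https://example.com/"},
                "response": {"status": 200,
                             "content": {"mimeType": "text/html"}},
            },
            {
                "request": {"method": "GET",
                            "url": "https://example.com/missing.js"},
                "response": {"status": 404,
                             "content": {"mimeType": "text/plain"}},
            },
        ]
    }
}

rows = summarize_har(sample_har)
# Count how many responses came back with each status code
status_counts = Counter(status for _, _, status, _ in rows)
for method, url, status, mime_type in rows:
    print(method, status, mime_type, url)
```

With a live session you would pass `proxy.har` instead of `sample_har`. Response bodies only show up in the archive if content capture was enabled when the proxy or HAR was created.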

To install the Python client on Linux (the BrowserMob Proxy binary itself is a separate download):

pip install browsermob-proxy

User

Answer 2 · edited 2024-06-01 00:31:30

I use a solution that does not need a proxy server. To add the executePhantomJS function, I patched the Selenium source according to the link below.

https://github.com/SeleniumHQ/selenium/pull/2331/files

Then, after obtaining the PhantomJS driver, run the following script:

from selenium.webdriver import PhantomJS

driver = PhantomJS()

script = """
    var page = this;
    page.onResourceRequested = function (req) {
        console.log('requested: ' + JSON.stringify(req, undefined, 4));
    };
    page.onResourceReceived = function (res) {
        console.log('received: ' + JSON.stringify(res, undefined, 4));
    };
"""

driver.execute_phantomjs(script)  # method added by the patch linked above
driver.get("http://ariya.github.com/js/random/")
driver.quit()

All requests are then logged to the console (usually the ghostdriver.log file).
