Pynav2
Headless programmatic web browser on top of Requests and Beautiful Soup
Requirements
Python 3.4+
Tested with unittest from Python 3.4 to 3.7
Installation
If python3 is the default Python binary:
pip install pynav2
If python2 is the default Python binary:
pip3 install pynav2
License
GNU LGPLv3 (GNU Lesser General Public License version 3)
Interactive mode examples
All examples require:
>>> from pynav2 import Browser
>>> b = Browser()
HTTP GET request and print the response
Get http://example.com (use https if it is available on the server)
>>> b.get('example.com')
<Response [200]>
>>> b.text  # alias for b.response.text
'<!DOCTYPE html>\n<html lang="mul" class="no-js">\n<head>\n<meta charset="utf-8">\n<title>example.com</title>...'
HTTP GET request and print the JSON response
Get http://example.com/user-agent/json; returns the JSON-encoded content of the response, if any
>>> b.get('example.com/user-agent/json')
<Response [200]>
>>> b.json  # alias for b.response.json()
{'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:56.0) Gecko/20100101 Firefox/56.0'}
HTTP POST request and print the response
>>> data = {'q': 'python'}
>>> b.post('example.com/search', data=data)
<Response [200]>
>>> b.text
'<!DOCTYPE html>\n<html lang="mul" class="no-js">\n<head>\n<meta charset="utf-8">\n<title>example.com</title>...'
HTTP POST JSON request and print the JSON response
>>> import json
>>> data = {'login': 'user', 'password': 'pass'}
>>> b.post('example.com/login', json=json.dumps(data))  # JSON to send in the body of the request
<Response [200]>
>>> b.json
{'login': 'success'}
HTTP HEAD request and print the response headers
>>> b.head('example.com')
<Response [200]>
>>> b.response.headers
{'Server': 'nginx', 'Content-Type': 'text/html; charset=utf-8', 'Content-Length': '48842', 'Age': '3154', 'Connection': 'keep-alive'}
HTTP PUT request and print the JSON response
>>> data = {'version': '2.1', 'licence': 'LGPL'}
>>> b.put('example.com/api/about/', data=data)
<Response [200]>
>>> b.json
{'update': 'success'}
HTTP PATCH request and print the JSON response
>>> data = {'version': '2.1'}
>>> b.patch('example.com/api/about/', data=data)
<Response [200]>
>>> b.json
{'patch': 'success'}
HTTP DELETE request and print the JSON response
>>> b.delete('example.com/api/user/102')
<Response [200]>
>>> b.json
{'delete': 'success'}
HTTP OPTIONS request and print the JSON response
>>> b.options('example.com/api/user')
<Response [200]>
>>> b.json
{'options': '...'}
Get all links
>>> b.get('example.com')
<Response [200]>
>>> b.links
['http://example.com/news', 'http://example.com/forum', 'http://example.com/contact']
>>> for link in b.links:
...     print(link)
...
http://example.com/news
http://example.com/forum
http://example.com/contact
Filter links
Any beautifulsoup.find_all() parameter can be added; see the Beautiful Soup documentation
>>> import re
>>> b.get('example.com')
<Response [200]>
>>> b.get_links(text='Python Events')  # regular expression
>>> b.get_links(class_="jump-link")  # no regular expression for the class attribute
>>> b.get_links(href="windows")  # regular expression
>>> b.get_links(title=re.compile('success'))  # manual regular expression
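The comments above suggest that plain string arguments (except for the class attribute) are treated as unanchored regular-expression searches against the attribute value. As a stdlib-only illustration of that kind of matching, with invented href values:

```python
import re

# Hypothetical href values, as they might appear in a crawled page.
hrefs = [
    'http://example.com/windows-download',
    'http://example.com/linux-download',
]

# A plain string such as 'windows' presumably behaves like an
# unanchored regex search, matching anywhere inside the value.
matching = [h for h in hrefs if re.search('windows', h)]
print(matching)  # ['http://example.com/windows-download']
```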
Get all images
>>> b.get('example.com')
<Response [200]>
>>> b.images
['http://example.com/img/logo.png', 'http://example.com/img/picture.jpg', 'http://there.com/news.gif']
Filter images
Any beautifulsoup.find_all() parameter can be added; see the Beautiful Soup documentation
>>> b.get('example.com')
<Response [200]>
>>> b.get_images(src='logo')  # regular expression
>>> b.get_images(class_='python-logo')  # no regular expression for the class attribute
>>> b.get_images(alt='yth')  # regular expression
Download a file
>>> b.verbose = True
>>> b.download('http://example.com/ubuntu-amd64', '/tmp')  # it will follow redirects and look for a Content-Disposition header to find the filename
downloading ubuntu-18.04.1-desktop-amd64.iso (1.8GB) to: /tmp/ubuntu-18.04.1-desktop-amd64.iso
download completed in 12 minutes 5 seconds (1.8GB)
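The comment above says download() inspects the Content-Disposition header to pick the filename. A minimal stdlib sketch of that lookup (an illustration, not pynav2's actual code; the header value is invented):

```python
from email.message import Message

def filename_from_header(content_disposition):
    # Parse an RFC 2183 Content-Disposition value with the stdlib
    # email machinery and extract its filename parameter.
    msg = Message()
    msg['Content-Disposition'] = content_disposition
    return msg.get_filename()

name = filename_from_header('attachment; filename="ubuntu-18.04.1-desktop-amd64.iso"')
print(name)  # ubuntu-18.04.1-desktop-amd64.iso
```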
Handle referer
>>> b.handle_referer = True
>>> b.get('somewhere.com')
>>> b.get('example.com')  # request headers will have http://somewhere.com as referer
>>> b.get('there.com')  # request headers will have http://example.com as referer
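A rough sketch of what handle_referer presumably does internally: remember the previously visited URL and send it as the Referer header of the next request (this is an illustration, not pynav2's source):

```python
class RefererTracker:
    """Remember the last visited URL and expose it as the next Referer."""

    def __init__(self):
        self.referer = None

    def headers_for(self, url):
        # Build the extra headers for this request, then record the
        # current URL as the referer for the following request.
        headers = {'Referer': self.referer} if self.referer else {}
        self.referer = url
        return headers

t = RefererTracker()
print(t.headers_for('http://somewhere.com'))  # {} (first request: no referer)
print(t.headers_for('http://example.com'))    # {'Referer': 'http://somewhere.com'}
```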
Set the referer manually
>>> b.referer = 'http://www.here.com'
>>> b.get('example.com')  # request headers will have http://www.here.com as referer
Set the user agent
The useragent module includes a list of user agents:
firefox_windows, chrome_windows, edge_windows, ie_windows, firefox_linux, chrome_linux, safari_mac
The default user agent is firefox_windows
>>> from pynav2 import useragent
>>> b.user_agent = useragent.firefox_linux
>>> b.get('example.com')  # request headers will have 'Mozilla/5.0 (X11; Linux x86_64; rv:56.0) Gecko/20100101 Firefox/56.0' as User-Agent
>>> b.user_agent = 'my_app/v1.0'
>>> b.get('example.com')  # request headers will have my_app/v1.0 as User-Agent
Set a sleep time before requests
>>> b.set_sleep_time(0.5, 1.5)  # pick a random x seconds between 0.5 and 1.5 and wait x before each request
>>> b.get('example.com')  # waits x seconds before the request
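set_sleep_time(0.5, 1.5) presumably draws a fresh random delay in that range before each request; a stdlib sketch of that behaviour (not pynav2's actual implementation):

```python
import random
import time

def sleep_before_request(min_s, max_s):
    # Draw a uniform random delay in [min_s, max_s] and wait that long.
    delay = random.uniform(min_s, max_s)
    time.sleep(delay)
    return delay

# Small values so the example runs quickly.
d = sleep_before_request(0.01, 0.02)
print(0.01 <= d <= 0.02)  # True
```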
Define the request timeout
10 second timeout:
>>> b.timeout = 10
Close all opened TCP sessions
>>> b.get('example1.com')
>>> b.get('example2.com')
>>> b.get('example3.com')
>>> b.session.close()
Set an HTTP proxy for HTTPS requests, for one request
>>> b.get('https://httpbin.org/ip').json()['origin']
111.111.111.111
>>> proxies = {'https': '10.0.0.0:1234'}
>>> b.timeout = 10  # could be useful to wait 10 seconds if proxies are slow
>>> b.get('https://httpbin.org/ip', proxies=proxies).json()['origin']
10.0.0.0
Set an HTTP proxy for HTTPS requests, for all requests
>>> b.get('https://httpbin.org/ip').json()['origin']
111.111.111.111
>>> b.proxies = {'https': '10.0.0.0:1234'}
>>> b.timeout = 10  # could be useful to wait 10 seconds if proxies are slow
>>> b.get('https://httpbin.org/ip').json()['origin']
10.0.0.0
Set an HTTP proxy for all HTTPS requests, with a different proxy for a specific domain
>>> b.get('https://httpbin.org/ip').json()['origin']
111.111.111.111
>>> b.proxies = {'https': '10.0.0.0:1234', 'https://specific-domain.com': '10.11.12.13:1234'}
>>> b.timeout = 10  # could be useful to wait 10 seconds if proxies are slow
>>> b.get('https://httpbin.org/ip').json()['origin']
10.0.0.0
>>> b.get('https://specific-domain.com/ip').json()['origin']
10.11.12.13
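The example above works because a Requests-style proxies dict is resolved by preferring the most specific key, so 'https://specific-domain.com' wins over the bare 'https' entry for that host. A simplified stdlib sketch of that lookup (not the library's actual code):

```python
from urllib.parse import urlparse

def select_proxy(url, proxies):
    # Prefer a scheme://host key, then fall back to the bare scheme key.
    parts = urlparse(url)
    for key in ('{0}://{1}'.format(parts.scheme, parts.hostname), parts.scheme):
        if key in proxies:
            return proxies[key]
    return None

proxies = {'https': '10.0.0.0:1234',
           'https://specific-domain.com': '10.11.12.13:1234'}
print(select_proxy('https://httpbin.org/ip', proxies))          # 10.0.0.0:1234
print(select_proxy('https://specific-domain.com/ip', proxies))  # 10.11.12.13:1234
```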
Get the Beautiful Soup instance
After a GET or POST request, browser.bs (BeautifulSoup) is automatically instantiated with b.response.text
>>> b.get('example.com')
>>> b.bs.find_all('a')
Get the Requests object instances
>>> b.get('example.com')
>>> b.session
>>> b.request
>>> b.response