Scraping a web page that requires the site to give you a session cookie first

import urllib2
import cookielib

url = 'http://nrega.ap.gov.in/Nregs/FrontServlet?requestType=HouseholdInf_engRH&hhid=192420317026010002&actionVal=musterrolls&type=Normal'

def grab_data_with_cookie(cookie_jar, url):
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie_jar))
    data = opener.open(url)
    return data

cj = cookielib.CookieJar()

# grab the data
data1 = grab_data_with_cookie(cj, url)

# the second time we do this, we get back the excel sheet.
data2 = grab_data_with_cookie(cj, url)
stuff2 = data2.read()

2 Answers

User

Answer 1 · edited 2024-05-14 21:12:07

With requests, this is a trivial task:

>>> import requests
>>> url = 'http://httpbin.org/cookies/set/requests-is/awesome'
>>> r = requests.get(url)

>>> print r.cookies
{'requests-is': 'awesome'}
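
The snippet above shows requests reading a cookie from a response. In the question, the second request also has to send that cookie back, and a requests.Session is the usual way to do that. Below is a minimal sketch, assuming the server issues its cookie on the first GET and serves the real content on the second. The helper name `fetch_after_cookie` and its `session` parameter are hypothetical, not part of requests.

```python
def fetch_after_cookie(url, session=None):
    """GET `url` twice on one session and return the second response.

    `session` is a hypothetical hook: any object with a `get(url)` method.
    When omitted, a real requests.Session is created.
    """
    if session is None:
        import requests  # third-party package, imported lazily
        session = requests.Session()
    session.get(url)         # first request: the server issues the session cookie
    return session.get(url)  # second request: the session sends that cookie back
```

Calling `fetch_after_cookie(url).content` would then give the body of the second response. Whether two requests are enough depends on the site.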

User

Answer 2 · edited 2024-05-14 21:12:07

Using cookies with urllib2:

import cookielib
import urllib2

cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
# use opener to open different urls

You can use the same opener for multiple connections:

data = [opener.open(url).read() for url in urls]

Or install it globally:

urllib2.install_opener(opener)

In the latter case, the rest of the code looks the same with or without cookie support:

data = [urllib2.urlopen(url).read() for url in urls]
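
In Python 3, urllib2 and cookielib became urllib.request and http.cookiejar, and the same pattern carries over. Below is a minimal sketch of the shared-opener approach. To keep it runnable offline, it talks to a small local stand-in server rather than the site from the question. The server, the cookie `session=abc123`, and the names `CookieGateHandler`, `build_cookie_opener`, and `demo` are all assumptions made for illustration.

```python
import http.cookiejar
import http.server
import threading
import urllib.request


class CookieGateHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical stand-in server: issues a cookie first, serves data after."""

    def do_GET(self):
        has_cookie = "session=abc123" in self.headers.get("Cookie", "")
        body = b"the real data" if has_cookie else b"cookie issued, please retry"
        self.send_response(200)
        if not has_cookie:
            self.send_header("Set-Cookie", "session=abc123; Path=/")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def build_cookie_opener(jar, *extra_handlers):
    """Build an opener that stores cookies in `jar` and replays them."""
    return urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar), *extra_handlers
    )


def demo():
    """Fetch the same URL twice through one opener; return both bodies and the jar."""
    server = http.server.HTTPServer(("127.0.0.1", 0), CookieGateHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        jar = http.cookiejar.CookieJar()
        # An empty ProxyHandler keeps the loopback requests off any configured proxy.
        opener = build_cookie_opener(jar, urllib.request.ProxyHandler({}))
        with opener.open(url) as first:
            first_body = first.read()
        with opener.open(url) as second:  # the jar now supplies the cookie
            second_body = second.read()
        return first_body, second_body, jar
    finally:
        server.shutdown()
        server.server_close()
```

Calling `urllib.request.install_opener(opener)` makes plain `urllib.request.urlopen(url)` calls use the same cookie jar. This is the Python 3 equivalent of the global install shown above.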
