How do I log in using urllib and then use that session to access a web page in Python?

So, what I'm trying to do is log in to wallbase.cc and then grab the tags of NSFW wallpapers (you need to be logged in for that). It seems like I can log in fine, but when I try to access the wallpaper page it throws a 403 error. Here's the code I'm using:

import urllib2
import urllib
import cookielib
import re

username = 'xxxx'
password = 'xxxx'

# Cookie jar + opener so any cookies the site sets are kept for later requests
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
urllib2.install_opener(opener)

# Login form fields (hard-coded CSRF token and base64-encoded referer)
payload = {
    'csrf' : '371b3b4bd0d1990048354e2056cd36f20b1d7088',
    'ref' : 'aHR0cDovL3dhbGxiYXNlLmNjLw==',
    'username' : username,
    'password' : password
    }
login_data = urllib.urlencode(payload)
# POST request for the login endpoint
req = urllib2.Request('http://wallbase.cc/user/login', login_data)

url = "http://wallbase.cc/wallpaper/2098029"

#Opens url of each pic
usock = urllib2.urlopen(url)
data = usock.read()
usock.close()

Any ideas? By the way, the wallpaper used isn't actually NSFW; it's just mis-tagged.


1 Answer

You could try this library: http://wwwsearch.sourceforge.net/mechanize/

Here's an example:

import re
import mechanize

br = mechanize.Browser()
br.open("http://www.example.com/")
# follow second link with element text matching regular expression
response1 = br.follow_link(text_regex=r"cheese\s*shop", nr=1)
assert br.viewing_html()
print br.title()
print response1.geturl()
print response1.info()  # headers
print response1.read()  # body

br.select_form(name="order")
# Browser passes through unknown attributes (including methods)
# to the selected HTMLForm.
br["cheeses"] = ["mozzarella", "caerphilly"]  # (the method here is __setitem__)
# Submit current form.  Browser calls .close() on the current response on
# navigation, so this closes response1
response2 = br.submit()

# print currently selected form (don't call .submit() on this, use br.submit())
print br.form

response3 = br.back()  # back to cheese shop (same data as response1)
# the history mechanism returns cached response objects
# we can still use the response, even though it was .close()d
response3.get_data()  # like .seek(0) followed by .read()
response4 = br.reload()  # fetches from server

for form in br.forms():
    print form
# .links() optionally accepts the keyword args of .follow_/.find_link()
for link in br.links(url_regex="python.org"):
    print link
    br.follow_link(link)  # takes EITHER Link instance OR keyword args
    br.back()
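
Applied to the site in the question, a rough sketch might look like the following. The form index (nr=0), the field names, and the call that disables robots.txt handling are assumptions; the real login form would need to be inspected first (for example by printing br.forms()). The URLs and credential placeholders are taken from the question.

import cookielib
import mechanize

br = mechanize.Browser()
br.set_cookiejar(cookielib.LWPCookieJar())  # keep the session cookie between requests
br.set_handle_robots(False)  # assumption: robots.txt would otherwise block the fetch

# Load the login page so mechanize can parse its forms (including hidden fields).
br.open("http://wallbase.cc/user/login")

# Assumption: the login form is the first form on the page and uses these field names.
br.select_form(nr=0)
br["username"] = "xxxx"
br["password"] = "xxxx"
br.submit()  # hidden fields such as the CSRF token are sent automatically

# The same Browser (and its cookie jar) is reused, so the logged-in session carries over.
response = br.open("http://wallbase.cc/wallpaper/2098029")
print response.read()

The advantage over hand-rolling the POST with urllib2 is that mechanize picks up the hidden csrf and ref values from the form it just loaded and keeps the cookies across requests, instead of relying on hard-coded values.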
