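Corporate proxy works in C# but not in Python
I'm behind a proxy on my company network and want to download a web page's source with Python. A colleague wrote a similar program in C# and it works, but my Python code does not, even though we use the same credentials. Here is the C# code: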
using System;
using System.IO;
using System.Net;

class Program
{
    static void Main(string[] args)
    {
        var netCred = new NetworkCredential { UserName = "asdf", Password = "pass", Domain = "Africa" };
        var webProxy = new WebProxy("corp_proxy:8080", true);
        webProxy.Credentials = netCred;
        using (WebClient client = new WebClient() { Proxy = webProxy })
        using (Stream data = client.OpenRead(@"http://www.google.com"))
        using (StreamReader reader = new StreamReader(data))
        {
            client.Proxy = webProxy;
            string s = reader.ReadToEnd();
            Console.WriteLine(s);
        }
        Console.ReadLine();
    }
}
Here is my Python code:
import urllib2

proxy_user = "Africa\\asdf"
proxy_password = "pass"
proxy_port = "8080"
proxy_url = "corp_proxy"

def proxy_test():
    proxy_tot = 'http://' + proxy_user + ':' + proxy_password + '@' + proxy_url + ':' + proxy_port
    proxy = urllib2.ProxyHandler({"http":proxy_tot})
    auth = urllib2.HTTPBasicAuthHandler()
    opener = urllib2.build_opener(proxy, auth, urllib2.HTTPHandler)
    urllib2.install_opener(opener)
    x = urllib2.urlopen('http://www.google.com')
    print x.read()

if __name__ == "__main__":
    proxy_test()
The error I get is:
Traceback (most recent call last):
  File ".\test.py", line 21, in <module>
    proxy_test()
  File ".\test.py", line 17, in proxy_test
    x = urllib2.urlopen('http://www.google.com')
  File "C:\Python27\Lib\urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python27\Lib\urllib2.py", line 410, in open
    response = meth(req, response)
  File "C:\Python27\Lib\urllib2.py", line 523, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python27\Lib\urllib2.py", line 448, in error
    return self._call_chain(*args)
  File "C:\Python27\Lib\urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "C:\Python27\Lib\urllib2.py", line 531, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 407: Proxy Authentication Required
Then I tried using https, and got this error instead:
Traceback (most recent call last):
  File ".\test.py", line 21, in <module>
    proxy_test()
  File ".\test.py", line 17, in proxy_test
    x = urllib2.urlopen('http://www.google.com')
  File "C:\Python27\Lib\urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python27\Lib\urllib2.py", line 404, in open
    response = self._open(req, data)
  File "C:\Python27\Lib\urllib2.py", line 422, in _open
    '_open', req)
  File "C:\Python27\Lib\urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "C:\Python27\Lib\urllib2.py", line 722, in <lambda>
    meth(r, proxy, type))
  File "C:\Python27\Lib\urllib2.py", line 751, in proxy_open
    return self.parent.open(req, timeout=req.timeout)
  File "C:\Python27\Lib\urllib2.py", line 404, in open
    response = self._open(req, data)
  File "C:\Python27\Lib\urllib2.py", line 422, in _open
    '_open', req)
  File "C:\Python27\Lib\urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "C:\Python27\Lib\urllib2.py", line 1222, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "C:\Python27\Lib\urllib2.py", line 1184, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 1] _ssl.c:510: error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol>
So what is wrong with my Python code?
3 Answers
1
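It looks like you are using an HTTP (not HTTPS) proxy.
The proxy's reply says it could not authenticate you: HTTP Error 407: Proxy Authentication Required.
You can try the code below. You can also look at the Proxy-Authenticate header the proxy returns to see exactly what it requires.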
import urllib2

# Route plain-HTTP requests through the proxy.
proxy_handler = urllib2.ProxyHandler({'http': 'http://proxy.company.local:3128/'})
# Answer the proxy's 407 challenge with Basic credentials for the given realm.
proxy_auth_handler = urllib2.ProxyBasicAuthHandler()
proxy_auth_handler.add_password('Company Proxy Realm', 'proxy.company.local', 'username', 'password')
opener = urllib2.build_opener(proxy_handler, proxy_auth_handler)
opener.open('http://www.google.com')
opener.open('https://www.google.com')
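To see which schemes the proxy actually offers, one option is to send a single unauthenticated request and read the Proxy-Authenticate header from the 407 reply. The following is only a sketch: the helper names `offered_schemes`, `proxy_challenges` and `probe_proxy` are made up for illustration, the import shim is there so the same code runs on Python 2 (as in the question) and Python 3, and it assumes each header value carries a single challenge.

```python
try:                                    # Python 2, as used in the question
    import urllib2 as urlreq
    from urllib2 import HTTPError
except ImportError:                     # Python 3 equivalents
    import urllib.request as urlreq
    from urllib.error import HTTPError


def offered_schemes(header_values):
    """Return the scheme name (first token) of each Proxy-Authenticate value."""
    schemes = []
    for value in header_values:
        tokens = value.split(None, 1)
        if tokens:
            schemes.append(tokens[0])
    return schemes


def proxy_challenges(err):
    """Collect every Proxy-Authenticate header carried by an HTTPError."""
    headers = err.headers
    if hasattr(headers, "get_all"):                   # Python 3 message object
        return headers.get_all("Proxy-Authenticate") or []
    return headers.getheaders("Proxy-Authenticate")   # Python 2 message object


def probe_proxy(proxy_url, target="http://www.google.com"):
    """Send one unauthenticated request and report what the proxy asks for."""
    opener = urlreq.build_opener(urlreq.ProxyHandler({"http": proxy_url}))
    try:
        opener.open(target, timeout=10)
    except HTTPError as err:
        if err.code != 407:
            raise
        return offered_schemes(proxy_challenges(err))
    return []  # the proxy let the request through without credentials
```

Calling `probe_proxy("http://corp_proxy:8080")` from inside the corporate network would return the list of scheme names. If that list has no `Basic` entry, ProxyBasicAuthHandler cannot satisfy the proxy and a handler for one of the listed schemes is needed instead.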
1
If you don't have to use urllib2, the requests library may be simpler.
import requests
from requests.auth import HTTPProxyAuth

proxy_user = "Africa\\asdf"
proxy_password = "pass"
proxy_url = "http://corp_proxy:8080"

def proxy_test():
    proxy = {'http': proxy_url}
    # HTTPProxyAuth sends the credentials as a Basic Proxy-Authorization header.
    auth = HTTPProxyAuth(proxy_user, proxy_password)
    r = requests.get('http://www.google.com/', proxies=proxy, auth=auth)
    print(r.text)

if __name__ == "__main__":
    proxy_test()
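This Stack Overflow post covers this problem and shows how to use a requests.Session object, and the requests documentation has more on proxies. Hopefully you will find these easier to follow.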
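For the requests.Session variant, a minimal sketch might look like the following. `make_session` is a hypothetical helper name, not part of requests, and the sketch assumes the proxy accepts Basic credentials. A proxy that only offers NTLM would still answer with 407.

```python
import requests
from requests.auth import HTTPProxyAuth


def make_session(proxy_url, user, password):
    """Return a Session that sends plain-HTTP requests through the proxy."""
    session = requests.Session()
    session.proxies = {"http": proxy_url}
    # HTTPProxyAuth adds a Basic Proxy-Authorization header to each request.
    session.auth = HTTPProxyAuth(user, password)
    return session
```

With the values from the question this would be used as `session = make_session("http://corp_proxy:8080", "Africa\\asdf", "pass")` followed by `session.get("http://www.google.com/")`. The advantage of a Session is that the proxy settings and credentials are configured once and reused for every request made through it.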
2
Because your proxy uses NTLM authentication, you need a compatible AuthHandler, such as ProxyNtlmAuthHandler.