Corporate proxy works in C# but not in Python

2 votes
3 answers
1200 views
Asked 2025-04-18 13:39

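I'm behind a proxy on my company network and want to download a web page's source with Python. A colleague wrote a similar program in C# and it works, but my Python code fails even though we use the same credentials. Here is the C# code: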

class Program 
    { 
        static void Main(string[] args) 
        { 
            var netCred = new NetworkCredential { UserName = "asdf", Password = "pass", Domain = "Africa" }; 
            var webProxy = new WebProxy("corp_proxy:8080", true);   

            webProxy.Credentials = netCred; 

            using (WebClient client = new WebClient() { Proxy = webProxy }) 
            using (Stream data = client.OpenRead("http://www.google.com")) 
            using (StreamReader reader = new StreamReader(data)) 
            { 
                client.Proxy = webProxy; 
                string s = reader.ReadToEnd(); 
                Console.WriteLine(s); 
            } 

            Console.ReadLine(); 
        } 
    }

Here is my Python code:

import urllib2

proxy_user = "Africa\\asdf"
proxy_password = "pass"
proxy_port = "8080"
proxy_url = "corp_proxy"

def proxy_test():

  proxy_tot = 'http://' + proxy_user + ':' + proxy_password + '@' + proxy_url + ':' + proxy_port
  proxy = urllib2.ProxyHandler({"http":proxy_tot})
  auth = urllib2.HTTPBasicAuthHandler()
  opener = urllib2.build_opener(proxy, auth, urllib2.HTTPHandler)
  urllib2.install_opener(opener)
  x = urllib2.urlopen('http://www.google.com')
  print x.read()

if __name__ == "__main__":
  proxy_test()

The error I get is:

Traceback (most recent call last):
  File ".\test.py", line 21, in <module>
    proxy_test()
  File ".\test.py", line 17, in proxy_test
    x = urllib2.urlopen('http://www.google.com')
  File "C:\Python27\Lib\urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python27\Lib\urllib2.py", line 410, in open
    response = meth(req, response)
  File "C:\Python27\Lib\urllib2.py", line 523, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python27\Lib\urllib2.py", line 448, in error
    return self._call_chain(*args)
  File "C:\Python27\Lib\urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "C:\Python27\Lib\urllib2.py", line 531, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 407: Proxy Authentication Required

Then I tried using https, and got this error:

Traceback (most recent call last):
  File ".\test.py", line 21, in <module>
    proxy_test()
  File ".\test.py", line 17, in proxy_test
    x = urllib2.urlopen('http://www.google.com')
  File "C:\Python27\Lib\urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python27\Lib\urllib2.py", line 404, in open
    response = self._open(req, data)
  File "C:\Python27\Lib\urllib2.py", line 422, in _open
    '_open', req)
  File "C:\Python27\Lib\urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "C:\Python27\Lib\urllib2.py", line 722, in <lambda>
    meth(r, proxy, type))
  File "C:\Python27\Lib\urllib2.py", line 751, in proxy_open
    return self.parent.open(req, timeout=req.timeout)
  File "C:\Python27\Lib\urllib2.py", line 404, in open
    response = self._open(req, data)
  File "C:\Python27\Lib\urllib2.py", line 422, in _open
    '_open', req)
  File "C:\Python27\Lib\urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "C:\Python27\Lib\urllib2.py", line 1222, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "C:\Python27\Lib\urllib2.py", line 1184, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 1] _ssl.c:510: error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol>

So what is wrong with my Python code?

3 Answers

1

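It looks like you are using an HTTP (not HTTPS) proxy.

The proxy's reply shows that it could not authenticate you: HTTP Error 407: Proxy Authentication Required

To find out exactly what the proxy requires, look at the Proxy-Authenticate header it returns. You can try the following code: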

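import urllib2

# Route plain-HTTP requests through the proxy (host, port, realm and credentials are placeholders).
proxy_handler = urllib2.ProxyHandler({'http': 'http://proxy.company.local:3128/'})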
proxy_auth_handler = urllib2.ProxyBasicAuthHandler()
proxy_auth_handler.add_password('Company Proxy Realm', 'proxy.company.local', 'username', 'password')

opener = urllib2.build_opener(proxy_handler, proxy_auth_handler)
opener.open('http://www.google.com')
opener.open('https://www.google.com')
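
To see what the proxy actually asks for, one option is to send a request without credentials and read the Proxy-Authenticate header from the 407 reply. The following is a minimal sketch and not part of the original answer: it uses Python 3's urllib.request rather than urllib2, the names offered_schemes and probe_proxy are made up for illustration, and it assumes the proxy sends one challenge per header line.

```python
import urllib.error
import urllib.request


def offered_schemes(header_values):
    """Return the auth scheme name (first token) of each Proxy-Authenticate value."""
    schemes = []
    for value in header_values:
        parts = value.strip().split(None, 1)
        if parts:
            schemes.append(parts[0])
    return schemes


def probe_proxy(proxy_url, target="http://www.google.com"):
    """Send one unauthenticated request through the proxy and list the schemes it offers."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy_url})
    )
    try:
        opener.open(target, timeout=10)
    except urllib.error.HTTPError as err:
        if err.code == 407:
            # A proxy may send several Proxy-Authenticate headers, one per scheme.
            return offered_schemes(err.headers.get_all("Proxy-Authenticate") or [])
        raise
    return []  # the proxy let the request through without credentials
```

On the corporate network, probe_proxy("http://corp_proxy:8080") would return the scheme names the proxy advertises. If only NTLM or Negotiate shows up and Basic does not, a Basic auth handler on its own will not get past the proxy.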

1

If you don't have to use urllib2, the requests library may be simpler.

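import requests
from requests.auth import HTTPProxyAuth

proxy_user = "Africa\\asdf"
proxy_password = "pass"
proxy_url = "http://corp_proxy:8080"

def proxy_test():
    # Send plain-HTTP requests through the corporate proxy.
    proxy = {'http': proxy_url}
    # HTTPProxyAuth adds a Basic Proxy-Authorization header to the request.
    auth = HTTPProxyAuth(proxy_user, proxy_password)
    r = requests.get('http://www.google.com/', proxies=proxy, auth=auth)
    print(r.text)

if __name__ == "__main__":
    proxy_test()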

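This Stack Overflow post covers the problem and also shows how to use a requests.Session object, and the requests documentation has more information on proxies. Hopefully these are easier to follow.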

2

Because your proxy uses NTLM authentication, you need a compatible AuthHandler, such as ProxyNtlmAuthHandler.
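
If the proxy does require NTLM, as this answer suggests, that would also explain the 407 in the question: when credentials are embedded in the proxy URL, urllib2's ProxyHandler sends them as a Basic Proxy-Authorization header, which an NTLM-only proxy rejects. The sketch below (Python 3, with a hypothetical helper name basic_proxy_header) reproduces that header value for the question's credentials, so it can be compared with the schemes the proxy challenges for.

```python
import base64


def basic_proxy_header(user, password):
    """Build the Proxy-Authorization value that Basic auth would send."""
    token = base64.b64encode("{}:{}".format(user, password).encode("utf-8"))
    return "Basic " + token.decode("ascii")


# Credentials taken from the question; the domain is just part of the user name here.
header = basic_proxy_header("Africa\\asdf", "pass")
print(header)
```

The value is only a base64 encoding of "user:password", with none of the challenge/response exchange that NTLM expects, which is why a dedicated NTLM handler is needed instead.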
