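Python urllib and urllib2 not opening localhost URLs?
In Python, I can use urllib2 (and urllib) to open external URLs such as Google. However, I run into problems when I try to open localhost URLs. I have a Python SimpleHTTPServer running on port 8280, which I can reach successfully at http://localhost:8280/.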
python -m SimpleHTTPServer 8280
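It is worth mentioning that I am running Ubuntu and have a tool called CNTLM running, which handles authentication against our corporate web proxy. As a result, wget does not work on localhost URLs either, so I don't think this is a urllib problem!
Test script (test_urllib2.py):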
import urllib2
print "Opening Google..."
google = urllib2.urlopen("http://www.google.com/")
print google.read(100)
print "Google opened."
print "Opening localhost..."
localhost = urllib2.urlopen("http://localhost:8280/")
print localhost.read(100)
print "localhost opened."
Output:
$ ./test_urllib2.py
Opening Google...
<!doctype html><html><head><meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"><
Google opened.
Opening localhost...
Traceback (most recent call last):
  File "./test_urllib2.py", line 10, in <module>
    localhost = urllib2.urlopen("http://localhost:8280/")
  File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.6/urllib2.py", line 397, in open
    response = meth(req, response)
  File "/usr/lib/python2.6/urllib2.py", line 510, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.6/urllib2.py", line 429, in error
    result = self._call_chain(*args)
  File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 605, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/usr/lib/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 1161, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.6/urllib2.py", line 1134, in do_open
    r = h.getresponse()
  File "/usr/lib/python2.6/httplib.py", line 986, in getresponse
    response.begin()
  File "/usr/lib/python2.6/httplib.py", line 391, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.6/httplib.py", line 355, in _read_status
    raise BadStatusLine(line)
httplib.BadStatusLine
Solution: the problem was indeed caused by my using CNTLM behind our corporate web proxy (I am not sure exactly why that causes it). The fix is to use a ProxyHandler:
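import urllib2
# An empty dict disables all proxies, including any taken from the http_proxy environment variable
proxy_support = urllib2.ProxyHandler({})
opener = urllib2.build_opener(proxy_support)
print opener.open("http://localhost:8280/").read(100)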
Thanks to loki2302 for pointing me in this direction.
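For readers on Python 3, where urllib2 was merged into urllib.request, the same idea can be sketched roughly as follows. This is only an illustration under assumptions: the helper name open_without_proxy is made up for this example, and the demo starts its own throwaway server on a free port rather than using port 8280 from the question.

```python
import threading
import urllib.request
from http.server import HTTPServer, SimpleHTTPRequestHandler


def open_without_proxy(url, timeout=5):
    """Fetch url with an opener that ignores any configured proxy.

    Hypothetical helper name, not part of the standard library.
    """
    # An empty mapping means "use no proxies", so http_proxy and friends
    # from the environment are not consulted for this opener.
    no_proxy_handler = urllib.request.ProxyHandler({})
    opener = urllib.request.build_opener(no_proxy_handler)
    with opener.open(url, timeout=timeout) as response:
        return response.status, response.read(100)


if __name__ == "__main__":
    # Port 0 asks the OS for any free port (the question used 8280).
    server = HTTPServer(("127.0.0.1", 0), SimpleHTTPRequestHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    status, body = open_without_proxy("http://127.0.0.1:%d/" % port)
    print(status)
    server.shutdown()
```

A plausible explanation for the original failure, though the question does not confirm it, is that urllib2 picked up the proxy from the environment and sent the localhost request to CNTLM, which could not reach the local server on the client's behalf.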
3 Answers
1
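I ran into this problem with my own web server as well. The root cause was that my web server was single-threaded and could only handle one request at a time. While it was busy processing one request, it could not serve the other URL that I was requesting through urllib2.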
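The single-threaded situation described above can be illustrated with a small Python 3 sketch. It assumes Python 3.7 or later (for ThreadingHTTPServer), and the handler class and the /outer and /inner paths are invented for this example. The handler for /outer fetches /inner from the same server, which only works because each request gets its own thread.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Opener that bypasses any proxy, so requests really reach the local server.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class SelfCallingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: /outer fetches /inner from the same server."""

    def do_GET(self):
        if self.path == "/outer":
            port = self.server.server_address[1]
            # This nested request needs a second worker thread; a
            # single-threaded server would still be busy with /outer.
            url = "http://127.0.0.1:%d/inner" % port
            with OPENER.open(url, timeout=5) as inner:
                body = b"outer got: " + inner.read()
        else:
            body = b"inner"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_outer():
    """Start a threaded server on a free port and request /outer once."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), SelfCallingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/outer" % server.server_address[1]
        with OPENER.open(url, timeout=5) as response:
            return response.read()
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_outer())
```

Swapping ThreadingHTTPServer for the plain HTTPServer in this sketch should make the nested request hang until the timeout, which is the behaviour this answer describes.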
3
Try the urllib library instead:
import urllib
localhost = urllib.urlopen("http://localhost:8280/")
print localhost.read(100)
5
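Check whether the problem really lies in opening localhost, or whether JBoss is returning an invalid response (which a browser might work around):
- Try http://127.0.0.1:8280/ instead of "localhost:8280" (if that works, it is a name-resolution (DNS) problem)
- Use curl or wget to test whether JBoss is responding properly: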
wget http://localhost:8280/
You could also try running a simple Python HTTP server to see whether you can connect to a service other than JBoss:
python -m SimpleHTTPServer 8280
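The checks suggested in this answer can also be done from Python itself. The following Python 3 sketch (the function name describe_local_routing is an assumption made for illustration) reports what a host name resolves to, which proxies urllib would pick up, and whether the host would bypass them, for example through a no_proxy setting.

```python
import socket
import urllib.request


def describe_local_routing(host="localhost"):
    """Report what host resolves to and whether urllib would proxy it.

    Hypothetical diagnostic helper, not part of the standard library.
    """
    try:
        address = socket.gethostbyname(host)
    except socket.gaierror:
        address = None  # name resolution failed

    return {
        "address": address,
        # Proxy settings urllib would use by default (environment
        # variables, or system settings on some platforms).
        "proxies": urllib.request.getproxies(),
        # True if requests to this host would skip the proxy.
        "bypassed": bool(urllib.request.proxy_bypass(host)),
    }


if __name__ == "__main__":
    for key, value in describe_local_routing("localhost").items():
        print(key, "=", value)
```

If "proxies" lists an http entry and "bypassed" is false, requests to localhost are probably being sent through the proxy rather than directly to the local server.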