urllib2.urlopen() vs. urllib.urlopen(): urllib2 raises a 404 while urllib works. Why?

18 votes

1 answer

21098 views

Asked 2025-04-15 17:17

import urllib

print urllib.urlopen('http://www.reefgeek.com/equipment/Controllers_&_Monitors/Neptune_Systems_AquaController/Apex_Controller_&_Accessories/').read()

The script above runs fine and returns the expected result. However:

import urllib2

print urllib2.urlopen('http://www.reefgeek.com/equipment/Controllers_&_Monitors/Neptune_Systems_AquaController/Apex_Controller_&_Accessories/').read()

fails with the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/urllib2.py", line 124, in urlopen
    return _opener.open(url, data)
  File "/usr/lib/python2.5/urllib2.py", line 387, in open
    response = meth(req, response)
  File "/usr/lib/python2.5/urllib2.py", line 498, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.5/urllib2.py", line 425, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.5/urllib2.py", line 360, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.5/urllib2.py", line 506, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found

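Does anyone know why this happens? I'm running this on my laptop at home with no proxy settings: the laptop connects straight to my router, and from there to the internet.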

1 Answer

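That URL really does return a 404, but the response carries a lot of HTML content. urllib2 is (correctly) treating it as an error condition, whereas urllib.urlopen() does not raise on HTTP error statuses and simply returns the error page's body, which is why the first script appears to work. You can retrieve the content of that site's 404 page like this: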

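import urllib2
try:
    print urllib2.urlopen('http://www.reefgeek.com/equipment/Controllers_&_Monitors/Neptune_Systems_AquaController/Apex_Controller_&_Accessories/').read()
except urllib2.HTTPError, e:
    # HTTPError carries the full response, not just the status.
    print e.code       # numeric status, 404 here
    print e.msg        # reason phrase, "Not Found"
    print e.headers    # response headers
    print e.fp.read()  # body of the site's 404 page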
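
For readers on Python 3, where urllib2 was folded into urllib.request and urllib.error, the same idea can be sketched roughly as follows. This is only an illustration under stated assumptions: it does not contact the site from the question but starts a throwaway local server that always answers 404 with an HTML body. The names Always404Handler, NOT_FOUND_PAGE, fetch_any_status and demo are hypothetical and do not come from the original posts.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical error page served by the stand-in server below.
NOT_FOUND_PAGE = b"<html><body><h1>Custom 404 page</h1></body></html>"


class Always404Handler(BaseHTTPRequestHandler):
    """Stand-in for a site that serves an HTML page with a 404 status."""

    def do_GET(self):
        self.send_response(404)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(NOT_FOUND_PAGE)))
        self.end_headers()
        self.wfile.write(NOT_FOUND_PAGE)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_any_status(url, opener=None):
    """Return (status_code, body) even when the server answers with an error status."""
    opener = opener or urllib.request.build_opener()
    try:
        response = opener.open(url)
        try:
            return response.getcode(), response.read()
        finally:
            response.close()
    except urllib.error.HTTPError as e:
        # HTTPError doubles as a response object, so the error page is still readable.
        try:
            return e.code, e.read()
        finally:
            e.close()


def demo():
    """Start the local 404 server, fetch one URL from it, and shut it down."""
    server = HTTPServer(("127.0.0.1", 0), Always404Handler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # An empty ProxyHandler bypasses any configured proxy for the local demo server.
    direct = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        url = "http://127.0.0.1:%d/missing/" % server.server_port
        return fetch_any_status(url, opener=direct)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    code, body = demo()
    print(code)
    print(body.decode("utf-8"))
```

Catching the error and returning the status alongside the body roughly mimics the lenient behaviour of the old urllib.urlopen(), while still letting the caller see that the request failed.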

Answered 2025-04-15 by Python大师
