What are these errors, and how do I handle them?

0 votes
2 answers
2044 views
Asked 2025-04-15 16:31

I'm using this simple code:

for l in bios:
    OpenThisLink = url + l
    response = urllib2.urlopen(OpenThisLink)

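It opens about 200 URLs, which I then search with regular expressions (and BeautifulSoup). After opening a dozen or so URLs, though, I get the errors below and IDLE quits on its own. What do these errors mean, and how should I handle them?

Thanks.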

Traceback (most recent call last):
  File "\PROJECTS\JD\jd10.py", line 15, in <module>
    response = urllib2.urlopen(OpenThisLink)
  File "C:\Python26\lib\urllib2.py", line 124, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python26\lib\urllib2.py", line 389, in open
    response = meth(req, response)
  File "C:\Python26\lib\urllib2.py", line 502, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python26\lib\urllib2.py", line 421, in error
    result = self._call_chain(*args)
  File "C:\Python26\lib\urllib2.py", line 361, in _call_chain
    result = func(*args)
  File "C:\Python26\lib\urllib2.py", line 597, in http_error_302
    return self.parent.open(new)
  File "C:\Python26\lib\urllib2.py", line 389, in open
    response = meth(req, response)
  File "C:\Python26\lib\urllib2.py", line 502, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python26\lib\urllib2.py", line 421, in error
    result = self._call_chain(*args)
  File "C:\Python26\lib\urllib2.py", line 361, in _call_chain
    result = func(*args)
  File "C:\Python26\lib\urllib2.py", line 597, in http_error_302
    return self.parent.open(new)
  File "C:\Python26\lib\urllib2.py", line 389, in open
    response = meth(req, response)
  File "C:\Python26\lib\urllib2.py", line 502, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python26\lib\urllib2.py", line 427, in error
    return self._call_chain(*args)
  File "C:\Python26\lib\urllib2.py", line 361, in _call_chain
    result = func(*args)
  File "C:\Python26\lib\urllib2.py", line 510, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 404: Not Found

2 Answers

2

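I'm not very familiar with the specific library you're using. From what I can see, though, this is a long traceback that ends at the original error:

HTTPError: HTTP Error 404: Not Found

I think one of the links is broken, which raised an exception (an error) that was never handled.

To add: by "broken" I mean the server could not find the page, which is why you got the 404 error.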
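As a minimal sketch of what "handling" the exception could look like: the HTTPError object carries the HTTP status in its `code` attribute, so the caller can catch it and decide what to do with a 404. The helper name `fetch_or_none` and its `opener` parameter are assumptions made up for illustration and do not appear in the question; the import fallback is only there so the sketch also runs on Python 3, where `urllib2` was split into `urllib.request` and `urllib.error`.

```python
try:
    # Python 2, as used in the question
    from urllib2 import urlopen, HTTPError
except ImportError:
    # Python 3 equivalents
    from urllib.request import urlopen
    from urllib.error import HTTPError


def fetch_or_none(link, opener=urlopen):
    """Return the page body, or None if the server answers 404.

    `opener` is a hypothetical hook so the function can be tried
    without network access; by default it is urlopen.
    """
    try:
        response = opener(link)
    except HTTPError as e:
        if e.code == 404:
            return None  # broken link: the server could not find the page
        raise  # any other HTTP error is still reported to the caller
    return response.read()
```

Inside the loop from the question, `fetch_or_none(url + l)` would then return `None` for a missing page instead of stopping the whole run.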

3

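The error raised here is an HTTPError; specifically, one of the URLs you requested returned a 404, which means the server could not find the page. You can choose to ignore the error: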

for l in bios:
    OpenThisLink = url + l
    try:
        response = urllib2.urlopen(OpenThisLink)
    except urllib2.HTTPError:
        pass

Alternatively, you can re-raise the error with a (slightly) more meaningful message:

for l in bios:
    OpenThisLink = url + l
    try:
        response = urllib2.urlopen(OpenThisLink)
    except urllib2.HTTPError as e:
        raise Exception('Error opening %s: %s' % (e.geturl(), e))
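A possible middle ground between ignoring the error and re-raising it, offered as a rough sketch rather than the one right way: skip the broken links, but remember which ones failed so they can be reviewed after the loop. The function name `open_all` and its `opener` argument are hypothetical and introduced only for this example; `url` and `bios` stand for the variables from the question. The import fallback is an assumption added so the sketch also runs on Python 3.

```python
try:
    # Python 2, as used in the question
    from urllib2 import urlopen, HTTPError
except ImportError:
    # Python 3 equivalents
    from urllib.request import urlopen
    from urllib.error import HTTPError


def open_all(url, bios, opener=urlopen):
    """Open url + l for every l in bios; return (pages, failures).

    `opener` is a hypothetical hook for trying the function offline.
    """
    pages = {}     # link -> page body for links that opened
    failures = []  # (link, HTTP status code) for links that did not
    for l in bios:
        link = url + l
        try:
            pages[link] = opener(link).read()
        except HTTPError as e:
            # Keep going, but record the link and its status code.
            failures.append((link, e.code))
    return pages, failures
```

Calling `pages, failures = open_all(url, bios)` leaves a list of `(link, code)` pairs that can be printed or logged once every link has been tried.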
