Behavior of Python's urllib.request.urlopen when the network goes down

Asked 2025-04-18 12:12

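I'm having trouble with Python's urllib when the network connection is unstable: if there is no network connection the first time urllib.request.urlopen is called, I can't fetch anything from then on, even after the connection is back.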

 > python
 >>> import urllib.request
 >>> urllib.request.urlopen("http://www.google.com")
 <http.client.HTTPResponse object at 0x7f6f54681438>

 #Now disable internet connection:
 > sudo ip link set enp4s0 down

 >>> urllib.request.urlopen("http://www.google.com")
 Traceback (most recent call last):
   File "/usr/lib/python3.4/urllib/request.py", line 1189, in do_open
     h.request(req.get_method(), req.selector, req.data, headers)
   File "/usr/lib/python3.4/http/client.py", line 1090, in request
     self._send_request(method, url, body, headers)
   File "/usr/lib/python3.4/http/client.py", line 1128, in _send_request
     self.endheaders(body)
   File "/usr/lib/python3.4/http/client.py", line 1086, in endheaders
     self._send_output(message_body)
   File "/usr/lib/python3.4/http/client.py", line 924, in _send_output
     self.send(msg)
   File "/usr/lib/python3.4/http/client.py", line 859, in send
     self.connect()
   File "/usr/lib/python3.4/http/client.py", line 836, in connect
     self.timeout, self.source_address)
   File "/usr/lib/python3.4/socket.py", line 491, in create_connection
     for res in getaddrinfo(host, port, 0, SOCK_STREAM):
   File "/usr/lib/python3.4/socket.py", line 530, in getaddrinfo
     for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
 socket.gaierror: [Errno -2] Name or service not known

 During handling of the above exception, another exception occurred:

 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/usr/lib/python3.4/urllib/request.py", line 153, in urlopen
     return opener.open(url, data, timeout)
   File "/usr/lib/python3.4/urllib/request.py", line 455, in open
     response = self._open(req, data)
   File "/usr/lib/python3.4/urllib/request.py", line 473, in _open
     '_open', req)
   File "/usr/lib/python3.4/urllib/request.py", line 433, in _call_chain
     result = func(*args)
   File "/usr/lib/python3.4/urllib/request.py", line 1215, in http_open
     return self.do_open(http.client.HTTPConnection, req)
   File "/usr/lib/python3.4/urllib/request.py", line 1192, in do_open
     raise URLError(err)
 urllib.error.URLError: <urlopen error [Errno -2] Name or service not known>

 #Reenable internet connection:
 > sudo ip link set enp4s0 up #and wait a bit

 >>> urllib.request.urlopen("http://www.google.com")
 <http.client.HTTPResponse object at 0x7f6f5468c898>

So far, so good. Now do exactly the same thing, but without the first urlopen call:

 > python
 >>> import urllib.request
 # do not call urlopen before internet is down...


 #Now disable internet connection:
 > sudo ip link set enp4s0 down

 >>> urllib.request.urlopen("http://www.google.com")
 [exactly the same error message as above]

 #Reenable internet connection:
 > sudo ip link set enp4s0 up #and wait a bit

 #Ensure internet connection is up
 > ip link show enp4s0 up
 2: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP [...] 


 >>> urllib.request.urlopen("http://www.google.com")
 [exactly the same error message as above]
 #What's the problem? The internet connection IS up

 #However:
 > host www.google.com
 www.google.com has address 173.194.69.104
 [...]
 >>> urllib.request.urlopen("http://173.194.69.104")
 <http.client.HTTPResponse object at 0x7f3116a72e48>

So I guess this has something to do with DNS caching?

Finally, some information about my system:

 > python --version
 Python 3.4.1
 > uname -a
 Linux charon 3.15.3-1-ARCH #1 SMP PREEMPT Tue Jul 1 07:32:45 CEST 2014 x86_64 GNU/Linux

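Sorry for the odd formatting. I mixed 'normal' shell commands (starting with '>') with Python commands (starting with '>>>') to make the order of the commands clearer (they were obviously run in different terminals).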

1 Answer


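You've run into a well-known glibc issue. One can argue whether this is incorrect use of glibc or a problem in glibc itself. The res_init function is not part of the POSIX standard; it is an interface that originated in BSD, so it is hard to get completely right across platforms.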

There doesn't seem to be a Python bug report about this yet, so you may want to file one.

As a workaround, you can call res_init yourself via ctypes, but right now I'm not sure exactly how to do that.
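For what it's worth, here is a minimal sketch of what such a ctypes call might look like. It assumes a glibc-based Linux system where the C library exports `__res_init` (the symbol that the `res_init` macro usually resolves to in glibc); that may not hold on other platforms or libc implementations. The helper names `reload_resolver_config` and `urlopen_with_resolver_retry` are made up for illustration and are not part of urllib.

```python
import ctypes
import ctypes.util
import socket
import urllib.error
import urllib.request


def reload_resolver_config():
    """Ask the C library to re-read its resolver configuration.

    Returns True if a res_init symbol was found and reported success,
    False otherwise (for example on a platform without glibc).
    """
    # Assumption: fall back to the usual glibc soname if lookup fails.
    libname = ctypes.util.find_library("c") or "libc.so.6"
    try:
        libc = ctypes.CDLL(libname)
    except OSError:
        return False
    # Assumption: glibc exports "__res_init"; some libcs export "res_init".
    for symbol in ("__res_init", "res_init"):
        func = getattr(libc, symbol, None)
        if func is not None:
            func.restype = ctypes.c_int
            func.argtypes = []
            return func() == 0
    return False


def urlopen_with_resolver_retry(url, timeout=10):
    """Call urlopen; on a name-resolution failure, reload the resolver
    configuration and try exactly once more."""
    try:
        return urllib.request.urlopen(url, timeout=timeout)
    except urllib.error.URLError as err:
        # Only retry when the underlying cause is a DNS lookup error.
        if not isinstance(getattr(err, "reason", None), socket.gaierror):
            raise
        reload_resolver_config()
        return urllib.request.urlopen(url, timeout=timeout)
```

In the session from the question, calling `reload_resolver_config()` after the interface is back up, and before the next `urlopen`, would be the step to try. Whether it actually clears the error depends on the glibc version in use, so treat this as something to experiment with, not a guaranteed fix.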
