Python urllib2 fails to resume a download after the network reconnects
I'm using urllib2 to build a tool that can resume downloads, basically following this approach. I can kill the program and restart it, and it picks up where it left off; the final file ends up the same size as one downloaded in a single pass.
However, when I tested it by turning the network off and back on, it doesn't work well: the downloaded file comes out larger than it should be, and the file is unusable. Am I missing something, or could this be a bug in urllib2?
import os
import time
import urllib2

self.opener = urllib2.build_opener()
self.count = 0  # Counts downloaded size.
self.downloading = True
while (not(self.success) and self.downloading):
    try:
        self.Err = ""
        self._netfile = self.opener.open(self.url)
        self.filesize = float(self._netfile.info()['Content-Length'])
        if (os.path.exists(self.localfile) and os.path.isfile(self.localfile)):
            self.count = os.path.getsize(self.localfile)
        print self.count, "of", self.filesize, "downloaded."
        if self.count >= self.filesize:
            # already downloaded
            self.downloading = False
            self.success = True
            self._netfile.close()
            return
        if (os.path.exists(self.localfile) and os.path.isfile(self.localfile)):
            # File already exists, start where it left off:
            # This seems to corrupt the file sometimes?
            self._netfile.close()
            req = urllib2.Request(self.url)
            print "file downloading at byte: ", self.count
            req.add_header("Range", "bytes=%s-" % (self.count))
            self._netfile = self.opener.open(req)
        if (self.downloading):  # Don't do it if cancelled, downloading=false.
            next = self._netfile.read(1024)
            self._outfile = open(self.localfile, "ab")  # to append binary
            self._outfile.write(next)
            self.readsize = desc(self.filesize)  # get size mb/kb
            self.count += 1024
            while (len(next) > 0 and self.downloading):
                next = self._netfile.read(1024)
                self._outfile.write(next)
                self.count += len(next)
            self.success = True
    except IOError, e:
        print e
        self.Err = ("Download error, retrying in a few seconds: " + str(e))
        try:
            self._netfile.close()
        except Exception:
            pass
        time.sleep(8)  # Then repeat
1 Answer
I added self._outfile.close() alongside self._netfile.close() in the IOError handler, and that seems to have fixed it. I think the error came from trying to reopen the file for appending while it was still open.
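One plausible mechanism behind this fix, consistent with the answer: Python's file objects buffer writes in user space, so bytes written to the `"ab"` handle are invisible to `os.path.getsize()` until the file is flushed or closed. On a retry, the under-reported size produces a Range offset that re-fetches bytes already buffered, which then get appended twice. A small demo of the buffering effect (Python 3, file path is illustrative):

```python
import os
import tempfile

# Demonstrate that buffered writes don't reach the file on disk
# until the handle is flushed or closed.
path = os.path.join(tempfile.mkdtemp(), "partial.bin")  # illustrative download file

out = open(path, "ab")               # buffered append, like self._outfile
out.write(b"x" * 100)                # 100 bytes sit in the userspace buffer
size_before = os.path.getsize(path)  # still 0: nothing has been flushed

out.close()                          # closing flushes the buffer to disk
size_after = os.path.getsize(path)   # now 100
```

This is why closing the output file in the error handler, before the loop re-measures the partial file, keeps the Range offset honest.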