Curl: save only if not 404

0 votes

5 answers

5492 views

Asked 2025-04-17 09:40

I am writing a Python program to download some photos of the students at my school.

Here is my code:

import os
count = 0
max_c = 1000000
while max_c >= count:
    os.system("curl http://www.tjoernegaard.dk/Faelles/ElevFotos/"+str(count)+".jpg > "+str(count)+".jpg")
    count=count+1

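The problem is that I only want to save the JPGs that actually exist on the server (that is, requests that do not return a 404). I don't know the names of all the images on the server, so I have to request every image from 0 to 1000000, and not all of them exist. I therefore want to save an image only when it really exists. How can I do this (on Ubuntu)?

Thanks in advance!

5 Answers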

I suggest using Python's urllib library for this.

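import urllib  # Python 2: urllib.urlopen returns a response even for a 404

count = 0
max_c = 1000000
while max_c >= count:
    resp = urllib.urlopen("http://www.tjoernegaard.dk/Faelles/ElevFotos/"+str(count)+".jpg")
    if resp.getcode() == 404:
        pass  # the image does not exist: do nothing
    else:
        pass  # do what you need to do, e.g. write resp.read() to a file
    count = count + 1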
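
For readers on Python 3, note that `urllib.request.urlopen` raises `urllib.error.HTTPError` on a 404 instead of returning a response whose code can be compared, so the `getcode() == 404` test above does not carry over directly. The following is a minimal sketch of an existence check under that assumption. It swaps in a HEAD request so the image body is not downloaded just to test for it, which only helps if the server answers HEAD requests properly. The function name `image_exists` and the `opener` parameter are hypothetical, introduced here for illustration and to make the function testable without network access.

```python
import urllib.error
import urllib.request


def image_exists(url, opener=urllib.request.urlopen):
    """Return True if a HEAD request succeeds, False if the server says 404.

    `opener` is a hypothetical hook (not part of the original answer) that
    defaults to urllib.request.urlopen.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request) as resp:
            return 200 <= resp.status < 300
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # other HTTP errors (403, 500, ...) are left to the caller
```

A call such as `image_exists("http://www.tjoernegaard.dk/Faelles/ElevFotos/0.jpg")` could then guard the download step inside the loop.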

Answered 2025-04-17 by Python大师

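You can use the "-f" (--fail) option to make curl fail silently on HTTP errors: instead of printing the server's error page, curl produces no output and exits with a non-zero status (22). For example: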

curl -f site.com/file.jpg
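
To tie this back to the script in the question, here is a minimal sketch of calling `curl -f` from Python and checking its exit status. It uses `-o` rather than the shell redirection `>`, because with `>` the shell creates the output file before curl even runs, leaving an empty file behind on a 404. The helper names `build_curl_command` and `download_if_exists`, and the `run` parameter, are hypothetical and not part of the original answer. The curl manual also notes that `-f` is not fully fail-safe, so treat this as an illustration rather than a guarantee.

```python
import os
import subprocess


def build_curl_command(url, dest):
    """Build a curl invocation that exits non-zero on HTTP errors."""
    # -f / --fail : do not output the error page for HTTP status >= 400
    # -s / --silent: hide the progress meter
    # -o dest     : let curl write the file instead of shell redirection
    return ["curl", "-f", "-s", "-o", dest, url]


def download_if_exists(url, dest, run=subprocess.run):
    """Run curl and return True only when it exits with status 0.

    `run` is a hypothetical hook that defaults to subprocess.run.
    """
    result = run(build_curl_command(url, dest))
    if result.returncode != 0:
        # Remove any partial or empty file that may have been left behind.
        if os.path.exists(dest):
            os.remove(dest)
        return False
    return True
```

Inside the question's loop, the `os.system(...)` line could be replaced by something like `download_if_exists(base + str(count) + ".jpg", str(count) + ".jpg")`, where `base` stands for the URL prefix.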

Answered 2025-04-17 by Python大师

import urllib2
import sys

for i in range(1000000):
  try:
    # urlopen raises urllib2.HTTPError for a 404 (and other HTTP error statuses)
    pic = urllib2.urlopen("http://www.tjoernegaard.dk/Faelles/ElevFotos/"+str(i)+".jpg").read()
    # open in binary write mode; the default mode is read-only
    with open(str(i).zfill(7)+".jpg", "wb") as f:
      f.write(pic)
    print "SUCCESS "+str(i)
  except KeyboardInterrupt:
    sys.exit(1)
  except urllib2.HTTPError as e:
    print "ERROR("+str(e.code)+") "+str(i)

This should work: a 404 raises an exception (urllib2.HTTPError), so the file is never written for a missing image.
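
The code above is Python 2 (`urllib2`, the `print` statement). As a rough Python 3 equivalent, the sketch below uses `urllib.request` and `urllib.error`, which is where that functionality lives in Python 3. The function name `save_existing` and the parameters `dest_dir`, `base_url`, and `opener` are assumptions added for illustration; only the URL prefix and the zero-padded file names come from the thread.

```python
import os
import urllib.error
import urllib.request

BASE_URL = "http://www.tjoernegaard.dk/Faelles/ElevFotos/"  # from the question


def save_existing(indices, dest_dir=".", base_url=BASE_URL,
                  opener=urllib.request.urlopen):
    """Download <base_url><i>.jpg for each i and skip HTTP errors.

    Returns the list of file paths that were written. `opener` is a
    hypothetical hook that defaults to urllib.request.urlopen.
    """
    saved = []
    for i in indices:
        url = "{}{}.jpg".format(base_url, i)
        try:
            with opener(url) as resp:
                data = resp.read()
        except urllib.error.HTTPError as err:
            # A 404 (or any other HTTP error status) lands here; nothing is saved.
            print("ERROR({}) {}".format(err.code, i))
            continue
        path = os.path.join(dest_dir, str(i).zfill(7) + ".jpg")
        with open(path, "wb") as f:  # binary mode: JPEG data is bytes
            f.write(data)
        print("SUCCESS {}".format(i))
        saved.append(path)
    return saved
```

Calling `save_existing(range(1000000))` would cover the same indices as the loop in the answer. A million sequential requests will take a long time, so trying a small range first is probably wise.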

Answered 2025-04-17 by Python大师
