How can I capture wget's progress information from Python?

0 votes
3 answers
3066 views
Asked 2025-04-18 18:32

On Linux (Ubuntu), when I run the command wget www.example.com/file.zip -O file.zip, I see a progress bar that shows how the download is going. The screenshot shows what that progress bar looks like:

[Image: screenshot of the wget progress bar]

Is there a way in Python to get all of the information I outlined in red?

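In other words, I would like to store each of these values in its own Python variable:

  • Amount of data downloaded so far
  • Download speed
  • Time remaining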

3 Answers

0

You can use the subprocess module:

import os
import subprocess

process = subprocess.Popen(
    ['wget', 'http://speedtest.dal01.softlayer.com/downloads/test10.zip', '-O', '/dev/null'],
    stderr=subprocess.PIPE)

# wget prints its connection header first; progress lines follow the first empty line.
started = False
for line in process.stderr:
    line = line.decode("utf-8", "replace")
    if started:
        print(line.split())
    elif line == os.linesep:
        started = True

Now you only need to process the output of line.split() and adjust the wget arguments (the ones above are just for testing and discard the downloaded data).
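As a rough sketch of that processing step, the helper below turns one progress line into named fields. It assumes wget's dot-style output (the form wget normally uses when stderr is a pipe), where each line looks like `0K .......... .......... 0% 1.21M 8s`. The names `Progress` and `parse_progress_line` are made up for this illustration, and the exact line layout can differ between wget versions, so treat the field positions as an assumption to verify against your own output.

```python
from typing import NamedTuple, Optional


class Progress(NamedTuple):
    downloaded_kib: int  # offset printed at the start of the line, in K
    percent: int         # percentage complete
    speed: str           # speed as formatted by wget, e.g. "1.21M"
    remaining: str       # estimated time left as formatted by wget, e.g. "8s"


def parse_progress_line(line: str) -> Optional[Progress]:
    """Parse one dot-style progress line, or return None for any other line."""
    fields = line.split()
    # Assumed shape: "<offset>K", dot groups..., "<NN>%", "<speed>", "<remaining>"
    if len(fields) < 4:
        return None
    if not fields[0].endswith("K") or not fields[-3].endswith("%"):
        return None
    try:
        return Progress(
            downloaded_kib=int(fields[0][:-1]),
            percent=int(fields[-3][:-1]),
            speed=fields[-2],
            remaining=fields[-1],
        )
    except ValueError:
        return None
```

Inside the loop above you could call `parse_progress_line(line)` instead of `print(line.split())` and skip the lines for which it returns None, such as the header lines and the final summary line.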

The same subprocess approach works on Windows with Python 3.4:

import subprocess
import os

wget = os.path.join("C:\\" , "Program Files (x86)", "GnuWin32", "bin", "wget.exe")

process = subprocess.Popen(
    [wget, 'http://speedtest.dal01.softlayer.com/downloads/test10.zip', '-O', 'NUL'],
    stderr=subprocess.PIPE)

started = False
for line in process.stderr:
    line = line.decode("utf-8", "replace")
    if started:
        fields = line.split()
        # A full progress line has 9 fields: offset, 5 dot groups, percentage, speed, time left.
        if len(fields) == 9:
            percentage = fields[6]
            speed = fields[7]
            remaining = fields[8]
            print("Downloaded {} with {} per second and {} left.".format(percentage, speed, remaining), end='\r')
    elif line == os.linesep:
        # The first empty line marks the end of wget's header.
        started = True

2

You can implement your own wget with Python's urllib library and a custom callback function:

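import time
import urllib.request  # Python 3; on Python 2 the function is urllib.urlretrieve

start_time = None

def reporthook(count_blocks, block_size, total_size):
    global start_time
    if count_blocks == 0:
        # First call: the transfer is just starting.
        start_time = time.time()
        return
    duration = time.time() - start_time
    progress_size = int(count_blocks * block_size)
    print("downloaded %f%%" % (progress_size * 100 / float(total_size)))
    # etc ...

# url and filename are the address to fetch and the local path to save to.
urllib.request.urlretrieve(url, filename, reporthook)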

(See also this answer: https://stackoverflow.com/a/4152008/2314737)

A complete Python 3 implementation is available here: https://pypi.python.org/pypi/wget
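To make the "etc" part of the callback more concrete, here is a minimal sketch that derives the three values from the question (bytes downloaded, speed, time remaining) from the arguments urlretrieve passes to its hook. The helper names `progress_stats` and `make_reporthook` are invented for this example. It assumes the server reports a content length; when it does not, urlretrieve passes a total size of -1 and the remaining time is returned as None.

```python
import time
import urllib.request


def progress_stats(count_blocks, block_size, total_size, elapsed):
    """Return (bytes_downloaded, bytes_per_second, seconds_remaining or None)."""
    downloaded = count_blocks * block_size
    if total_size > 0:
        # The last block is usually partial, so cap at the reported size.
        downloaded = min(downloaded, total_size)
    speed = downloaded / elapsed if elapsed > 0 else 0.0
    if total_size > 0 and speed > 0:
        remaining = (total_size - downloaded) / speed
    else:
        remaining = None  # size unknown or nothing measured yet
    return downloaded, speed, remaining


def make_reporthook(callback):
    """Build a urlretrieve hook that forwards the three values to callback."""
    start = time.monotonic()

    def hook(count_blocks, block_size, total_size):
        elapsed = time.monotonic() - start
        callback(*progress_stats(count_blocks, block_size, total_size, elapsed))

    return hook


# Usage, with the URL and filename from the question as placeholders:
# urllib.request.urlretrieve("http://www.example.com/file.zip", "file.zip",
#                            make_reporthook(print))
```

Keeping the arithmetic in a separate function means it can be checked without any network access, and the callback can store the values in variables instead of printing them.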

0

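This information is written to stderr, so you need to read it from the child process's stderr pipe rather than from stdout. Because the output is updated continuously, we can use select to poll stderr. Here is an example: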

# -*- coding: utf-8 -*-
from subprocess import PIPE, Popen
import fcntl
import os
import select

proc = Popen(['wget', 'http://speedtest.london.linode.com/100MB-london.bin'], stderr=PIPE)

# Put the stderr pipe into non-blocking mode so read() returns whatever is available.
fd = proc.stderr.fileno()
fcntl.fcntl(fd, fcntl.F_SETFL, fcntl.fcntl(fd, fcntl.F_GETFL) | os.O_NONBLOCK)

buf = ''
while proc.poll() is None:
    # Wait up to 0.1 s for wget to write something to stderr.
    if select.select([fd], [], [], 0.1)[0]:
        chunk = proc.stderr.read()
        if chunk:
            buf += chunk.decode('utf-8', 'replace')
        # Treat the buffer as a progress line once it has a newline, a percentage and dots.
        if '\n' in buf and '%' in buf and '.' in buf:
            print(buf.strip().split())
            buf = ''
proc.wait()
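One detail of the non-blocking approach is that a read can return several lines at once, or stop in the middle of a line. The sketch below shows one possible way to handle that, by keeping the unfinished tail in the buffer and only handing complete lines to the parser. The function name `split_complete_lines` is hypothetical. Splitting on carriage returns as well as newlines is an assumption that also covers wget's bar-style progress, which redraws the same line.

```python
import re


def split_complete_lines(buf):
    """Return (complete_lines, remainder) for text that arrives in arbitrary chunks."""
    parts = re.split(r'[\r\n]+', buf)
    # Whatever follows the last line terminator is still incomplete.
    remainder = parts.pop()
    lines = [p for p in parts if p.strip()]
    return lines, remainder
```

In the loop above this would be used as `lines, buf = split_complete_lines(buf)` after appending each chunk, followed by `line.split()` on every returned line.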
