Copying the contents of a zip file in Python

1 vote
2 answers
4741 views
Asked 2025-04-18 13:50

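I want to recursively search a network directory for all .xls files that are stored inside zip archives. For every XLS file found in a zip file, I want to copy it to my local C: drive. Here is the script I have so far: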

import os
import zipfile
import fnmatch
import shutil

rootPath = r"L:\Data\Cases"
destPath = r"C:\Test"
allFileList = []
zipList = []

# Create a list containing all files contained within L:\Data\Cases
for dirname, dirnames, filenames in os.walk(rootPath):
    for filename in filenames:
        allFileList.append(os.path.join(dirname, filename))

# Return a list of filepaths containing zipfiles.
for file in allFileList:
    if file.endswith(".zip"):
        zipList.append(file)

for file in zipList:
    with zipfile.ZipFile(file) as zip_file:
        for member in zip_file.namelist():
            if member.endswith(".xls"):
                filename = os.path.basename(member)
                if not filename:
                    continue
                source = zip_file.open(member)
                target = os.path.join(destPath, filename)
                shutil.copy2(source, target)

Here is the traceback. I think the error comes from the way I am trying to copy the file inside the archive to the destination path.

Traceback (most recent call last):
  File "C:/Users/user/Desktop/parsecsv.py", line 30, in <module>
    shutil.copy2(source, target)
  File "C:\Program Files\Python278\lib\shutil.py", line 130, in copy2
    copyfile(src, dst)
  File "C:\Program Files\Python278\lib\shutil.py", line 68, in copyfile
    if _samefile(src, dst):
  File "C:\Program Files\Python278\lib\shutil.py", line 63, in _samefile
    return (os.path.normcase(os.path.abspath(src)) ==
  File "C:\Program Files\Python278\lib\ntpath.py", line 487, in abspath
    path = _getfullpathname(path)

Any suggestions?

2 Answers

1

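ZipFile.open() does not return a filesystem path. It returns a file-like ZipExtFile object, which shutil.copy2() cannot treat as a path. If what you want is the actual file written to disk, use ZipFile.extract() (then you don't need shutil.copy() at all):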

# NB: untested code, refer to the docs for more info
for file in zipList:
    with zipfile.ZipFile(file) as zip_file:
        for member in zip_file.namelist():
            if member.endswith(".xls"):
                zip_file.extract(member, destPath)
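
One thing to be aware of: extract() recreates the member's folder structure under destPath, whereas the script in the question used os.path.basename() to drop it. If a flat copy is what you want, something along these lines might work. It is only a sketch, written for Python 3, and the function name copy_xls_flat is made up for illustration. Files with the same name in different archives would overwrite each other.

```python
import os
import shutil
import zipfile


def copy_xls_flat(zip_path, dest_dir):
    """Copy every .xls member of zip_path into dest_dir, dropping inner folders."""
    copied = []
    with zipfile.ZipFile(zip_path) as zip_file:
        for member in zip_file.namelist():
            filename = os.path.basename(member)
            # Skip directory entries and anything that is not an .xls file
            if not filename or not filename.lower().endswith(".xls"):
                continue
            target = os.path.join(dest_dir, filename)
            # zip_file.open() gives a readable file object, so stream its bytes
            with zip_file.open(member) as source, open(target, "wb") as out:
                shutil.copyfileobj(source, out)
            copied.append(target)
    return copied
```

shutil.copyfileobj() works here because it takes two open file objects rather than paths, which is exactly what ZipFile.open() hands back.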

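Also, as a side note, you don't need to build a list of all files, then a list of zip files, and then iterate over that second list. You can do everything in a single pass: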

for dirname, dirnames, filenames in os.walk(rootPath):
    for filename in filenames:
        if not filename.endswith(".zip"):
            continue
        fullpath = os.path.join(dirname, filename)
        with zipfile.ZipFile(fullpath) as zip_file:
            for member in zip_file.namelist():
                if member.endswith(".xls"):
                    zip_file.extract(member, destPath)
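
The same single pass could be wrapped in a function and made a little more forgiving. The sketch below is an assumption-laden variant, not tested against your network share: it targets Python 3, the name extract_xls_from_tree is hypothetical, it compares extensions in lower case, and it uses zipfile.is_zipfile() to skip files that are named .zip but are not valid archives.

```python
import os
import zipfile


def extract_xls_from_tree(root_path, dest_path):
    """Walk root_path and extract .xls members from every readable zip file."""
    extracted = []
    for dirname, _dirnames, filenames in os.walk(root_path):
        for filename in filenames:
            fullpath = os.path.join(dirname, filename)
            # Compare in lower case so REPORT.ZIP and data.XLS are not missed
            if not filename.lower().endswith(".zip"):
                continue
            # Skip files that carry a .zip name but are not valid archives
            if not zipfile.is_zipfile(fullpath):
                continue
            with zipfile.ZipFile(fullpath) as zip_file:
                for member in zip_file.namelist():
                    if member.lower().endswith(".xls"):
                        # extract() returns the path of the file it created
                        extracted.append(zip_file.extract(member, dest_path))
    return extracted
```

Keep dest_path outside root_path, otherwise os.walk() may descend into the folders that the extraction has just created.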
2

As bruno mentioned, I don't think you can copy a file straight out of a zip archive as if it were an ordinary path on disk. I think a simpler approach is to extract first and then delete whatever you don't need. Use os.remove for the leftover files (shutil.rmtree only removes directory trees, not single files).

import os
import zipfile


def main():
    rootPath = "C:\\rootpath"
    destPath = "C:\\Test"
    allFileList = []
    zipList = []

    # Create a list containing every file found under rootPath
    for dirname, dirnames, filenames in os.walk(rootPath):
        for filename in filenames:
            allFileList.append(os.path.join(dirname, filename))

    # Keep only the paths that point to zip files
    for file in allFileList:
        if file.endswith(".zip"):
            zipList.append(file)

    # Extract the .xls members of each archive into destPath
    for file in zipList:
        with zipfile.ZipFile(file) as zip_file:
            for member in zip_file.namelist():
                if member.endswith(".xls"):
                    zip_file.extract(member, destPath)

    # Delete anything under destPath that is not an .xls file
    for dirname, dirnames, filenames in os.walk(destPath):
        for filename in filenames:
            if not filename.endswith(".xls"):
                os.remove(os.path.join(dirname, filename))


if __name__ == '__main__':
    main()
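
For what it's worth, the zipfile module can report what an archive contains without extracting anything, which is what the namelist() calls in the code above rely on. A minimal sketch for Python 3 follows. The helper name list_xls_members is invented for illustration, and infolist() is used instead of namelist() only to show the uncompressed sizes as well.

```python
import zipfile


def list_xls_members(zip_path):
    """Return (name, uncompressed size) for each .xls member, without extracting."""
    with zipfile.ZipFile(zip_path) as zip_file:
        return [
            (info.filename, info.file_size)
            for info in zip_file.infolist()
            if info.filename.lower().endswith(".xls")
        ]
```

Checking the listing first means only the wanted members ever reach the disk, so a cleanup pass is mostly useful when destPath already held other files.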
