Listing metadata (MD5, modification time, size, path) for large directories/files in Python

Posted 2024-03-28 15:37:16


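I am writing a script to fingerprint a directory of up to 8 TB containing more than 1 million files (including some files of ~50 GB) and export the results to a .csv, with rows of the form "md5","LastWriteTime","filesize","fullpath\file.ext":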

"md5","YYYYMMDDHHMMSS","12345","A:\aaa\bb\c\file1.ext"

I am stuck on the code, and the output .csv comes out empty:

def md5(fname):
    hash_md5 = hashlib.md5()
    with open(fname, "rb") as f:
        for chunk in iter(lambda: f.read(2 ** 20), b""):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()

def getSize(filename):
    st = os.stat(filename)
    return st.st_size()

with open('md5_filelist.csv', 'w') as md5_filelist:
    file.write('hash_md5.hexdigest','timestamp','st.st_size','os.path.abspath')

What am I doing wrong (I'm new to Python)? Many thanks.


Tags: file, csv, return, def, as, with, hash, open
1 answer
User
#1 · Posted 2024-03-28 15:37:16

Try this:

import hashlib
import os
import time

your_target_folder = "."


def get_size(filename):
    st = os.stat(filename)
    return str(st.st_size)


def get_last_write_time(filename):
    st = os.stat(filename)
    convert_time_to_human_readable = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(st.st_mtime))
    return convert_time_to_human_readable


def get_md5(fname):
    hash_md5 = hashlib.md5()
    with open(fname, "rb") as f:
        for chunk in iter(lambda: f.read(2 ** 20), b""):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()


# Walk the tree and append one comma-separated row per file.
for dirpath, _, filenames in os.walk(your_target_folder):
    for items in filenames:
        file_full_path = os.path.abspath(os.path.join(dirpath, items))

        try:
            my_last_data = ", ".join([
                get_md5(file_full_path),
                get_last_write_time(file_full_path),
                get_size(file_full_path),
                file_full_path,
            ]) + "\n"

            # Append mode keeps the rows already written if the run is interrupted.
            with open("md5_filelist.csv", "a") as my_save_file:
                my_save_file.write(my_last_data)

            print(str(file_full_path) + "  ||| Done")

        except Exception as err:
            # Report the file and keep going; Ctrl+C still stops the script.
            print("Error On " + str(file_full_path) + ": " + str(err))

I changed how the full path is built and added time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(st.st_mtime)) to convert the modification time into a human-readable format.

Good luck...
