Automatically create a DataFrame for each folder

Published 2024-04-27 04:29:35


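Each folder contains one CSV per month of the year (1.csv, 2.csv, 3.csv, and so on). The script builds a DataFrame that combines the 9th column of all 12 CSVs and writes it to a sheet in an xlsx file named concentrated.xlsx. It works, but only for one directory at a time.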

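import pandas as pd
from glob import glob
from natsort import natsorted

files = glob('2014/*.csv')  # glob is a function: call it with (), not []
sorted_files = natsorted(files)

def read_9th(fn):
    # headers is not defined in this snippet
    return pd.read_csv(fn, usecols=[9], names=headers)

big_df = pd.concat([read_9th(fn) for fn in sorted_files], axis=1)
writer = pd.ExcelWriter('concentrated.xlsx', engine='openpyxl')
big_df.to_excel(writer, sheet_name='2014')

writer.close()  # writer.save() was removed in pandas 2.0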

Is it possible to create a DataFrame for each directory automatically, instead of building one by hand for every folder like this:

files14 = glob('2014/*.csv')
files15 = glob('2015/*.csv')

sorted_files14 = natsorted(files14)
sorted_files15 = natsorted(files15)

def read_9th(fn):
    return pd.read_csv(fn, usecols=[9], names=headers)

big_df = pd.concat([read_9th(fn) for fn in sorted_files14], axis=1)
big_df1 = pd.concat([read_9th(fn) for fn in sorted_files15], axis=1)
writer = pd.ExcelWriter('concentrated.xlsx', engine='openpyxl')
big_df.to_excel(writer, sheet_name='2014')
big_df1.to_excel(writer, sheet_name='2015')

writer.close()  # writer.save() was removed in pandas 2.0

1 answer
User
#1 · Posted 2024-04-27 04:29:35

If you have a list of the folders to process, for example

folders = os.listdir('.')
# or 
folders = ['2014', '2015', '2016']

you can do this:

writer = pd.ExcelWriter('concentrated.xlsx', engine='openpyxl')
for folder in folders:
    files = glob('%s/*.csv' % folder)
    sorted_files = natsorted(files)

    # one DataFrame and one sheet per folder, named after the folder
    big_df = pd.concat([read_9th(fn) for fn in sorted_files], axis=1)
    big_df.to_excel(writer, sheet_name=folder)

writer.close()  # writer.save() was removed in pandas 2.0
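
One caveat with building the list from os.listdir('.'): it returns plain files as well as directories, so the loop could try to glob inside something that is not a folder. The sketch below is one possible way to collect only the sub-directories that actually contain CSV files, using the standard library alone. The names natural_key and csv_files_by_folder are hypothetical helpers, not part of pandas or natsort, and the numeric sort assumes the files are named 1.csv, 2.csv, ... as in the question.

```python
from pathlib import Path


def natural_key(path):
    """Sort key that puts '2.csv' before '10.csv'; non-numeric names go last."""
    stem = path.stem
    return (0, int(stem), "") if stem.isdecimal() else (1, 0, stem)


def csv_files_by_folder(root="."):
    """Map each sub-directory of root that holds CSVs to its sorted CSV paths.

    Hypothetical helper: plain files and directories without any CSV
    are skipped, so every key is safe to use as a folder name.
    """
    result = {}
    for folder in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        files = sorted(folder.glob("*.csv"), key=natural_key)
        if files:  # ignore directories that contain no CSV files
            result[folder.name] = [str(f) for f in files]
    return result
```

Under these assumptions, the loop above could iterate over csv_files_by_folder('.').items() and use each key as the sheet name and each value in place of sorted_files.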
