Python script to generate a CSV file listing folder names and their associated files

Posted 2024-04-27 05:40:52

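My goal is to generate a CSV file that lists project names and the documents associated with each one. The project names will be the folder names (e.g. Project1, Project2), and the documents will be the files located in each folder.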

Ideal output of the CSV file

  • Project Name__________Documents
  • Project1__________test.txt_________test.ppt
  • Project2__________payroll.ppt

Folder structure

C:\SHH\Testenv

C:\SHH\Testenv\Project1

C:\SHH\Testenv\Project2

C:\SHH\Testenv\Project1\test.txt

C:\SHH\Testenv\Project1\test.ppt

C:\SHH\Testenv\Project2\payroll.ppt

The code I have tried

import os
import xlwt 
import csv 
from os import walk

path = r'C:\SHH\Testenv'  # raw string so the backslashes are taken literally
folders = [] # list that will contain folder names (basically the project names)
pathf = [] # list that will contain the directory of each folder 
files = [] # list of files in a folder (basically documents for each project) 

for item in os.listdir(path):
    if not os.path.isfile(os.path.join(path, item)):
        folders.append(os.path.join(item)) 
    pathf.append(os.path.join(path,item)) 

for x in pathf : 
    for (dirpath, dirnames, filenames) in walk(x):
        files.extend(filenames)
        print files

I am stuck on associating each file with its respective folder and then writing the result to the CSV file.

Thanks in advance.


3 Answers

os.walk() and the csv module are your friends for this task:

import os
import csv

path = '/tmp/SSH/Testenv'

with open('/tmp/output.csv', 'wb') as csvfile:
  writer = csv.writer(csvfile)
  writer.writerow(['Project Name', 'Documents'])
  for dirpath, _, filenames in os.walk(path):
    if filenames:
      writer.writerow([os.path.basename(dirpath)] + filenames)

Or, if you prefer a generator expression:

^{pr2}$
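
The snippet for this variant did not survive here. A minimal sketch of what a generator-expression version might look like, assuming Python 3 (text mode with newline='' rather than 'wb') and csv.writer.writerows(). The helper name write_project_csv and the temporary demo tree are hypothetical and not part of the original answer:

```python
import csv
import os
import tempfile


def write_project_csv(root, out_path):
    """Write one CSV row per directory under root that contains files."""
    # Python 3: text mode with newline='' (the answer's 'wb' is Python 2 style).
    with open(out_path, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['Project Name', 'Documents'])
        # The generator expression yields one row per directory that has files.
        writer.writerows(
            [os.path.basename(dirpath)] + filenames
            for dirpath, _, filenames in os.walk(root)
            if filenames
        )


# Demo on a throwaway tree that mirrors the question's layout (assumed names).
demo_root = tempfile.mkdtemp()
for project, name in [('Project1', 'test.txt'), ('Project1', 'test.ppt'),
                      ('Project2', 'payroll.ppt')]:
    os.makedirs(os.path.join(demo_root, project), exist_ok=True)
    open(os.path.join(demo_root, project, name), 'w').close()

# Keep the output outside the walked tree so it is not listed as a document.
demo_csv = os.path.join(tempfile.mkdtemp(), 'output.csv')
write_project_csv(demo_root, demo_csv)
with open(demo_csv, newline='') as f:
    demo_rows = list(csv.reader(f))
```

As with the loop version, the row order depends on the order in which os.walk() visits the directories.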

Result:

Project Name,Documents
Project2,payroll.ppt
Project1,test.ppt,test.txt


Edit: The unsorted output bothered me. Here is a version that sorts the projects and also sorts the files within each project:
with open('/tmp/output.csv', 'wb') as csvfile:
  writer = csv.writer(csvfile)
  writer.writerow(['Project Name', 'Documents'])
  for dirpath, dirs, filenames in os.walk(path, topdown=True):
    dirs.sort()
    if filenames:
      writer.writerow([os.path.basename(dirpath)] + sorted(filenames))

Result:

Project Name,Documents
Project1,test.ppt,test.txt
Project2,payroll.ppt
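
The sorted version relies on a documented os.walk() behaviour: with topdown=True, modifying dirs in place changes the order in which subdirectories are visited. A small sketch, assuming Python 3 and a throwaway temporary tree (the names demo_root and visited are illustrative only):

```python
import os
import tempfile

# Create the subdirectories in a deliberately unsorted order.
demo_root = tempfile.mkdtemp()
for name in ['Project2', 'Project1', 'Project3']:
    os.makedirs(os.path.join(demo_root, name))

visited = []
for dirpath, dirs, filenames in os.walk(demo_root, topdown=True):
    # Sort in place: rebinding with dirs = sorted(dirs) would not affect the walk.
    dirs.sort()
    visited.append(os.path.basename(dirpath))

# visited[0] is the root itself; the rest are the projects in walk order.
print(visited[1:])
```

Sorting has to happen in place because os.walk() keeps a reference to the same list object and reads it after the loop body has run.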

It may be easier to process one project/directory completely before moving on to the next. Also, a dictionary seems like the ideal structure here.

import os

path = r'C:\SHH\Testenv'  # raw string so the backslashes are taken literally
projects = {}

for item in os.listdir(path):
    current = os.path.join(path, item)
    if os.path.isdir(current):
        projects[item] = []
        for f in os.listdir(current):
            if os.path.isfile(os.path.join(current, f)):
                projects[item].append(f)

# the with block closes the file even if a write fails
with open('projects.csv', 'w') as f:
    f.write('Project Name____Documents\n')
    for p in projects:
        f.write(p + '____' + '____'.join(projects[p]) + '\n')

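The first step is to find the entries in the root directory that are themselves directories, i.e. the projects (os.path.isdir()). We create an entry in the dict for each project, holding an empty list. Next, we list all files in that project directory and add them to the list.
Since you don't have a typical CSV structure, I just used plain file I/O. The project name and the documents are separated by four underscores, but you can easily adjust that.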
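
If a regular comma-separated file is wanted after all, the same dictionary can be handed to the csv module. A minimal sketch, assuming Python 3; the function names collect_projects and write_projects, the delimiter parameter, and the temporary demo tree are illustrative and not from the answer:

```python
import csv
import os
import tempfile


def collect_projects(root):
    """Map each immediate subdirectory of root to a sorted list of its files."""
    projects = {}
    for item in sorted(os.listdir(root)):
        current = os.path.join(root, item)
        if os.path.isdir(current):
            projects[item] = sorted(
                f for f in os.listdir(current)
                if os.path.isfile(os.path.join(current, f))
            )
    return projects


def write_projects(projects, out_path, delimiter=','):
    """Write one row per project: the name followed by its documents."""
    with open(out_path, 'w', newline='') as f:
        writer = csv.writer(f, delimiter=delimiter)
        writer.writerow(['Project Name', 'Documents'])
        for name, docs in projects.items():
            writer.writerow([name] + docs)


# Demo on a throwaway tree shaped like the question's folders (assumed names).
demo_root = tempfile.mkdtemp()
for project, name in [('Project1', 'test.txt'), ('Project1', 'test.ppt'),
                      ('Project2', 'payroll.ppt')]:
    os.makedirs(os.path.join(demo_root, project), exist_ok=True)
    open(os.path.join(demo_root, project, name), 'w').close()

demo_csv = os.path.join(tempfile.mkdtemp(), 'projects.csv')
write_projects(collect_projects(demo_root), demo_csv)
with open(demo_csv, newline='') as f:
    demo_rows = list(csv.reader(f))
```

Note that csv.writer only accepts a single-character delimiter, so the four-underscore separator would still need plain file I/O as in the answer.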

Try this:

from os import walk, listdir
from os.path import join, isfile

path = r'C:\SHH\Testenv'

# use walk
for (dirpath, dirnames, filenames) in walk(path):
    # at every directory, check if there is at least one file
    # i.e. check that it is neither empty nor full of other directories
    files_found = False
    # listdir is imported directly above; the os module itself is not imported
    for dir_f in listdir(dirpath):
        if isfile(join(dirpath, dir_f)):
            files_found = True
            break

    # if we found at least one file, output csv-style format
    if files_found:
        print(dirpath + "," + ",".join([f for f in listdir(dirpath) if isfile(join(dirpath, f))]))

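Also note the difference between os.path.join(), which joins path components, and str.join() (used here as ",".join(...)), which joins a sequence of strings with a separator, in this case a comma (,).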
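
A short side-by-side sketch of the two joins, assuming Python 3; the variable names doc_path and row are illustrative:

```python
import os

# os.path.join builds a path using the platform's separator.
doc_path = os.path.join('Project1', 'test.txt')

# str.join glues a sequence of strings together with the given separator.
row = ','.join(['Project1', 'test.ppt', 'test.txt'])

print(doc_path)  # Project1/test.txt on POSIX, Project1\test.txt on Windows
print(row)       # Project1,test.ppt,test.txt
```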
