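How do I parse a directory tree in Python?
I have a folder called "notes" that contains categories such as "science" and "maths". Each category contains subcategories, such as "quantum mechanics" and "linear algebra".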
./notes
--> ./notes/maths
------> ./notes/maths/linear_algebra
--> ./notes/physics/
------> ./notes/physics/quantum_mechanics
My problem is that I don't know how to separate the categories and subcategories into two different lists or arrays.
2 Answers
1
os.walk is a good fit for this. By default it walks the directory tree top-down, and you can easily stop it from descending past the second level by emptying 'dirnames' in place.
import os

pth = "/path/to/notes"

def getCats(pth):
    cats = []
    subcats = []
    for (dirpath, dirnames, filenames) in os.walk(pth):
        # print(dirpath + "\n\t", "\n\t".join(dirnames), "\n%d files" % len(filenames))
        if dirpath == pth:
            cats = list(dirnames)  # first level: the categories (copied)
        else:
            subcats.extend(dirnames)  # second level: the subcategories
            dirnames[:] = []  # don't walk any further downwards
    # subcats = list(set(subcats))  # uncomment this if you want 'subcats' to be unique
    return (cats, subcats)
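To see the pruning in action, here is a minimal sketch that runs the same idea against a throwaway tree. The function name `split_levels`, the temporary directory, and the extra `deeper` folder are all made up for illustration; the only real behavior relied on is that os.walk, in its default top-down mode, respects in-place changes to the directory list it yields.

```python
import os
import tempfile


def split_levels(top):
    """Return (categories, subcategories) found under top, two levels deep."""
    cats, subcats = [], []
    for dirpath, dirnames, _filenames in os.walk(top):
        if dirpath == top:
            dirnames.sort()        # in-place sort also fixes the visiting order
            cats = list(dirnames)  # copy, so later changes cannot affect it
        else:
            subcats.extend(sorted(dirnames))
            dirnames[:] = []       # in-place clear: os.walk will not descend further
    return cats, subcats


with tempfile.TemporaryDirectory() as top:
    # Hypothetical layout mirroring the question, plus one deeper level.
    os.makedirs(os.path.join(top, "maths", "linear_algebra", "deeper"))
    os.makedirs(os.path.join(top, "physics", "quantum_mechanics"))
    cats, subcats = split_levels(top)

print(cats)     # ['maths', 'physics']
print(subcats)  # ['linear_algebra', 'quantum_mechanics']
```

The third-level `deeper` folder never shows up, because the slice assignment empties the very list that os.walk consults before descending. Rebinding the name (`dirnames = []`) would not have that effect.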
16
You can use os.walk.
#!/usr/bin/env python
import os
for root, dirs, files in os.walk('notes'):
    print(root, dirs, files)
A simple two-level traversal:
import os
from os.path import isdir, join

def cats_and_subs(root='notes'):
    """
    Collect categories and subcategories.
    """
    # Build real lists: in Python 3, filter() returns a one-shot iterator
    # that would already be exhausted by the time it is returned.
    categories = [d for d in os.listdir(root) if isdir(join(root, d))]
    sub_categories = []
    for c in categories:
        sub_categories += [d for d in os.listdir(join(root, c))
                           if isdir(join(root, c, d))]
    # categories and sub_categories are lists,
    # categories would hold stuff like 'science', 'maths'
    # sub_categories would contain 'Quantum Mechanics', 'Linear Algebra', ...
    return (categories, sub_categories)

if __name__ == '__main__':
    print(cats_and_subs(root='/path/to/your/notes'))
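If it is useful to know which subcategory belongs to which category, the same two-level listing can be sketched with os.scandir, building a dict first and flattening it into the two lists afterwards. This is an alternative to the answer above, not a restatement of it; the name `subcategories_by_category` and the temporary test tree are illustrative assumptions.

```python
import os
import tempfile


def subcategories_by_category(root):
    """Map each first-level directory name to a sorted list of its subdirectories."""
    mapping = {}
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir():
                with os.scandir(entry.path) as children:
                    mapping[entry.name] = sorted(
                        child.name for child in children if child.is_dir()
                    )
    return mapping


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout mirroring the question.
    os.makedirs(os.path.join(root, "maths", "linear_algebra"))
    os.makedirs(os.path.join(root, "physics", "quantum_mechanics"))
    mapping = subcategories_by_category(root)

# Flatten into the two lists the question asks for.
categories = sorted(mapping)
sub_categories = [sub for cat in categories for sub in mapping[cat]]

print(categories)      # ['maths', 'physics']
print(sub_categories)  # ['linear_algebra', 'quantum_mechanics']
```

Keeping the dict around preserves the parent of each subcategory, which two flat lists lose.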