Answers to the question: How do I list all files of a directory?

How do I list all files of a directory?


1 answer

Anonymous, 1 day ago


<h2>Preliminary notes</h2> <ul> <li>Although the question text makes a clear distinction between the terms file and directory, some may argue that directories are actually special files</li> <li>The statement "all files of a directory" can be interpreted in two ways: <ol> <li>All direct (level 1) descendants only</li> <li>All descendants in the whole directory tree (including the ones in sub-directories)</li> </ol></li> <li>When the question was asked, I imagine that Python 2 was the LTS version; however, the code samples will be run by Python 3(.5) (I will keep them as Python 2 compliant as possible; also, any code belonging to Python that I post is from v3.5.4, unless otherwise specified). That has consequences related to another keyword in the question: "add them into a list": <ul> <li>In pre-Python 2.2 versions, sequences (iterables) were mostly represented by lists (tuples, sets, ...)</li> <li>In Python 2.2, the concept of a generator (via <a href="https://docs.python.org/3/reference/simple_stmts.html#the-yield-statement" rel="noreferrer">[Python 3]: The yield statement</a>) was introduced. As time passed, generator counterparts started to appear for functions that returned/worked with lists</li> <li>In Python 3, generator is the default behavior</li> <li>I am not sure whether returning a list is still mandatory (or a generator would do as well), but passing a generator to the list constructor will create a list out of it (and also consume it). The example below illustrates the differences on <a href="https://docs.python.org/3/library/functions.html#map" rel="noreferrer">[Python 3]: map(function, iterable, ...)</a></li> </ul> <blockquote> <pre class="lang-py prettyprint-override"><code>>>> import sys
>>> sys.version
'2.7.10 (default, Mar 8 2016, 15:02:46) [MSC v.1600 64 bit (AMD64)]'
>>> m = map(lambda x: x, [1, 2, 3])  # Just a dummy lambda function
>>> m, type(m)
([1, 2, 3], <type 'list'>)
>>> len(m)
3
</code></pre> </blockquote> <blockquote> <pre class="lang-py prettyprint-override"><code>>>> import sys
>>> sys.version
'3.5.4 (v3.5.4:3f56838, Aug 8 2017, 02:17:05) [MSC v.1900 64 bit (AMD64)]'
>>> m = map(lambda x: x, [1, 2, 3])
>>> m, type(m)
(<map object at 0x000001B4257342B0>, <class 'map'>)
>>> len(m)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'map' has no len()
>>> lm0 = list(m)  # Build a list from the generator
>>> lm0, type(lm0)
([1, 2, 3], <class 'list'>)
>>>
>>> lm1 = list(m)  # Build a list from the same generator
>>> lm1, type(lm1)  # Empty list now - generator already consumed
([], <class 'list'>)
</code></pre> </blockquote></li> <li>The examples will be based on a directory called root_dir with the following structure (this example is for Win, but I am using the same tree on Lnx as well): <blockquote> <pre class="lang-py prettyprint-override"><code>E:\Work\Dev\StackOverflow\q003207219>tree /f "root_dir" 
Folder PATH listing for volume Work
Volume serial number is 00000029 3655:6FED
E:\WORK\DEV\STACKOVERFLOW\Q003207219\ROOT_DIR
¦   file0
¦   file1
¦
+---dir0
¦   +---dir00
¦   ¦   ¦   file000
¦   ¦   ¦
¦   ¦   +---dir000
¦   ¦           file0000
¦   ¦
¦   +---dir01
¦   ¦       file010
¦   ¦       file011
¦   ¦
¦   +---dir02
¦       +---dir020
¦           +---dir0200
+---dir1
¦       file10
¦       file11
¦       file12
¦
+---dir2
¦   ¦   file20
¦   ¦
¦   +---dir20
¦           file200
¦
+---dir3
</code></pre> </blockquote></li> </ul> <h2>Solutions</h2> <h3>Programmatic approaches:</h3> <ol> <li><a href="https://docs.python.org/3/library/os.html#os.listdir" rel="noreferrer">[Python 3]: os.listdir(path='.')</a> <blockquote> Return a list containing the names of the entries in the directory given by path. The list is in arbitrary order, and does not include the special entries <code>'.'</code> and <code>'..'</code> ... </blockquote> <blockquote> <pre class="lang-py prettyprint-override"><code>>>> import os
>>> root_dir = "root_dir"  # Path relative to current dir (os.getcwd())
>>>
>>> os.listdir(root_dir)  # List all the items in root_dir
['dir0', 'dir1', 'dir2', 'dir3', 'file0', 'file1']
>>>
>>> [item for item in os.listdir(root_dir) if os.path.isfile(os.path.join(root_dir, item))]  # Filter items and only keep files (strip out directories)
['file0', 'file1']
</code></pre> </blockquote> A more elaborate example (code_os_listdir.py): <pre class="lang-py prettyprint-override"><code>import os
from pprint import pformat


def _get_dir_content(path, include_folders, recursive):
    # Generator flavor: yields entries one by one
    entries = os.listdir(path)
    for entry in entries:
        entry_with_path = os.path.join(path, entry)
        if os.path.isdir(entry_with_path):
            if include_folders:
                yield entry_with_path
            if recursive:
                for sub_entry in _get_dir_content(entry_with_path, include_folders, recursive):
                    yield sub_entry
        else:
            yield entry_with_path


def get_dir_content(path, include_folders=True, recursive=True, prepend_folder_name=True):
    # Length of the "path + separator" prefix that is stripped on request
    path_len = len(path) + len(os.path.sep)
    for item in _get_dir_content(path, include_folders, recursive):
        yield item if prepend_folder_name else item[path_len:]


def _get_dir_content_old(path, include_folders, recursive):
    # Classic flavor: builds and returns a list
    entries = os.listdir(path)
    ret = list()
    for entry in entries:
        entry_with_path = os.path.join(path, entry)
        if os.path.isdir(entry_with_path):
            if include_folders:
                ret.append(entry_with_path)
            if recursive:
                ret.extend(_get_dir_content_old(entry_with_path, include_folders, recursive))
        else:
            ret.append(entry_with_path)
    return ret


def get_dir_content_old(path, include_folders=True, recursive=True, prepend_folder_name=True):
    path_len = len(path) + len(os.path.sep)
    return [item if prepend_folder_name else item[path_len:] for item in _get_dir_content_old(path, include_folders, recursive)]


def main():
    root_dir = "root_dir"
    ret0 = get_dir_content(root_dir, include_folders=True, recursive=True, prepend_folder_name=True)
    lret0 = list(ret0)
    print(ret0, len(lret0), pformat(lret0))
    ret1 = get_dir_content_old(root_dir, include_folders=False, recursive=True, prepend_folder_name=False)
    print(len(ret1), pformat(ret1))


if __name__ == "__main__":
    main()
</code></pre> Notes: <ul> <li>There are two implementations: <ul> <li>One that uses generators (of course here it seems useless, since I immediately convert the result to a list)</li> <li>The classic one (function names ending in _old)</li> </ul></li> <li>Recursion is used (to get into subdirectories)</li> <li>For each implementation there are two functions: <ul> <li>One that starts with an underscore (_): "private" (should not be called directly), which does all the work</li> <li>The public one (a wrapper over the previous one): it just strips off the initial path (if required) from the returned entries. It is an ugly implementation, but it is the only idea that I could come up with at this point</li> </ul></li> <li>In terms of performance, generators are generally a little bit faster (considering both creation and iteration times), but I did not test them in recursive functions, and I am also iterating inside the function over inner generators - I do not know how performance friendly that is</li> <li>Play with the arguments to get different results</li> </ul> Output: <blockquote> <pre class="lang-py prettyprint-override"><code>(py35x64_test) E:\Work\Dev\StackOverflow\q003207219>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" "code_os_listdir.py"
<generator object get_dir_content at 0x000001BDDBB3DF10> 22 ['root_dir\\dir0',
 'root_dir\\dir0\\dir00',
 'root_dir\\dir0\\dir00\\dir000',
 'root_dir\\dir0\\dir00\\dir000\\file0000',
 'root_dir\\dir0\\dir00\\file000',
 'root_dir\\dir0\\dir01',
 'root_dir\\dir0\\dir01\\file010',
 'root_dir\\dir0\\dir01\\file011',
 'root_dir\\dir0\\dir02',
 'root_dir\\dir0\\dir02\\dir020',
 'root_dir\\dir0\\dir02\\dir020\\dir0200',
 'root_dir\\dir1',
 'root_dir\\dir1\\file10',
 'root_dir\\dir1\\file11',
 'root_dir\\dir1\\file12',
 'root_dir\\dir2',
 'root_dir\\dir2\\dir20',
 'root_dir\\dir2\\dir20\\file200',
 'root_dir\\dir2\\file20',
 'root_dir\\dir3',
 'root_dir\\file0',
 'root_dir\\file1']
11 ['dir0\\dir00\\dir000\\file0000',
 'dir0\\dir00\\file000',
 'dir0\\dir01\\file010',
 'dir0\\dir01\\file011',
 'dir1\\file10',
 'dir1\\file11',
 'dir1\\file12',
 'dir2\\dir20\\file200',
 'dir2\\file20',
 'file0',
 'file1']
</code></pre> </blockquote></li> </ol> <ol start="2"> <li><a href="https://docs.python.org/3/library/os.html#os.scandir" rel="noreferrer">[Python 3]: os.scandir(path='.')</a> (Python 3.5+, backport: <a href="https://pypi.org/project/scandir" rel="noreferrer">[PyPI]: scandir</a>) <blockquote> Return an iterator of <a href="https://docs.python.org/3/library/os.html#os.DirEntry" rel="noreferrer">os.DirEntry</a> objects corresponding to the entries in the directory given by path. The entries are yielded in arbitrary order, and the special entries <code>'.'</code> and <code>'..'</code> are not included. Using <a href="https://docs.python.org/3/library/os.html#os.scandir" rel="noreferrer">scandir()</a> instead of <a href="https://docs.python.org/3/library/os.html#os.listdir" rel="noreferrer">listdir()</a> can significantly increase the performance of code that also needs file type or file attribute information, because <a href="https://docs.python.org/3/library/os.html#os.DirEntry" rel="noreferrer">os.DirEntry</a> objects expose this information if the operating system provides it when scanning a directory. 
All <a href="https://docs.python.org/3/library/os.html#os.DirEntry" rel="noreferrer">os.DirEntry</a> methods may perform a system call, but <a href="https://docs.python.org/3/library/os.html#os.DirEntry.is_dir" rel="noreferrer">is_dir()</a> and <a href="https://docs.python.org/3/library/os.html#os.DirEntry.is_file" rel="noreferrer">is_file()</a> usually only require a system call for symbolic links; <a href="https://docs.python.org/3/library/os.html#os.DirEntry.stat" rel="noreferrer">os.DirEntry.stat()</a> always requires a system call on Unix but only requires one for symbolic links on Windows. </blockquote> <blockquote> <pre class="lang-py prettyprint-override"><code>>>> import os
>>> root_dir = os.path.join(".", "root_dir")  # Explicitly prepending current directory
>>> root_dir
'.\\root_dir'
>>>
>>> scandir_iterator = os.scandir(root_dir)
>>> scandir_iterator
<nt.ScandirIterator object at 0x00000268CF4BC140>
>>> [item.path for item in scandir_iterator]
['.\\root_dir\\dir0', '.\\root_dir\\dir1', '.\\root_dir\\dir2', '.\\root_dir\\dir3', '.\\root_dir\\file0', '.\\root_dir\\file1']
>>>
>>> [item.path for item in scandir_iterator]  # Will yield an empty list as it was consumed by previous iteration (automatically performed by the list comprehension)
[]
>>>
>>> scandir_iterator = os.scandir(root_dir)  # Reinitialize the generator
>>> for item in scandir_iterator :
...     if os.path.isfile(item.path):
...         print(item.name)
...
file0
file1
</code></pre> </blockquote> Notes: <ul> <li>It is similar to <code>os.listdir</code></li> <li>But it is also more flexible (and offers more functionality), more Pythonic (and in some cases, faster)</li> </ul></li> </ol> <ol start="3"> <li><a href="https://docs.python.org/3/library/os.html#os.walk" rel="noreferrer">[Python 3]: os.walk(top, topdown=True, onerror=None, followlinks=False)</a> <blockquote> Generate the file names in a directory tree by walking the tree either top-down or bottom-up. 
For each directory in the tree rooted at directory top (including top itself), it yields a 3-tuple (<code>dirpath</code>, <code>dirnames</code>, <code>filenames</code>). </blockquote> <blockquote> <pre class="lang-py prettyprint-override"><code>>>> import os
>>> root_dir = os.path.join(os.getcwd(), "root_dir")  # Specify the full path
>>> root_dir
'E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir'
>>>
>>> walk_generator = os.walk(root_dir)
>>> root_dir_entry = next(walk_generator)  # First entry corresponds to the root dir (passed as an argument)
>>> root_dir_entry
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir', ['dir0', 'dir1', 'dir2', 'dir3'], ['file0', 'file1'])
>>>
>>> root_dir_entry[1] + root_dir_entry[2]  # Display dirs and files (direct descendants) in a single list
['dir0', 'dir1', 'dir2', 'dir3', 'file0', 'file1']
>>>
>>> [os.path.join(root_dir_entry[0], item) for item in root_dir_entry[1] + root_dir_entry[2]]  # Display all the entries in the previous list by their full path
['E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0', 'E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir1', 'E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir2', 'E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir3', 'E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\file0', 'E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\file1']
>>>
>>> for entry in walk_generator:  # Display the rest of the elements (corresponding to every subdir)
...     print(entry)
...
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0', ['dir00', 'dir01', 'dir02'], [])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0\\dir00', ['dir000'], ['file000'])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0\\dir00\\dir000', [], ['file0000'])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0\\dir01', [], ['file010', 'file011'])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0\\dir02', ['dir020'], [])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0\\dir02\\dir020', ['dir0200'], [])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0\\dir02\\dir020\\dir0200', [], [])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir1', [], ['file10', 'file11', 'file12'])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir2', ['dir20'], ['file20'])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir2\\dir20', [], ['file200'])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir3', [], [])
</code></pre> </blockquote> Notes: <ul> <li>Under the hood, it uses <code>os.scandir</code> (<code>os.listdir</code> on older versions)</li> <li>It does the heavy lifting of recursing into subfolders</li> </ul></li> </ol> <ol start="4"> <li><a href="https://docs.python.org/3/library/glob.html#glob.glob" rel="noreferrer">[Python 3]: glob.glob(pathname, *, recursive=False)</a> (<a href="https://docs.python.org/3/library/glob.html#glob.glob" rel="noreferrer">[Python 3]: glob.iglob(pathname, *, recursive=False)</a>) <blockquote> Return a possibly-empty list of path names that match pathname, which must be a string containing a path specification. pathname can be either absolute (like <code>/usr/src/Python-1.5/Makefile</code>) or relative (like <code>../../Tools/*/*.gif</code>), and can contain shell-style wildcards. Broken symlinks are included in the results (as in the shell). ... Changed in version 3.5: Support for recursive globs using “<code>**</code>”. 
</blockquote> <blockquote> <pre class="lang-py prettyprint-override"><code>>>> import glob, os
>>> wildcard_pattern = "*"
>>> root_dir = os.path.join("root_dir", wildcard_pattern)  # Match every file/dir name
>>> root_dir
'root_dir\\*'
>>>
>>> glob_list = glob.glob(root_dir)
>>> glob_list
['root_dir\\dir0', 'root_dir\\dir1', 'root_dir\\dir2', 'root_dir\\dir3', 'root_dir\\file0', 'root_dir\\file1']
>>>
>>> [item.replace("root_dir" + os.path.sep, "") for item in glob_list]  # Strip the dir name and the path separator from beginning
['dir0', 'dir1', 'dir2', 'dir3', 'file0', 'file1']
>>>
>>> for entry in glob.iglob(root_dir + "*", recursive=True):
...     print(entry)
...
root_dir\
root_dir\dir0
root_dir\dir0\dir00
root_dir\dir0\dir00\dir000
root_dir\dir0\dir00\dir000\file0000
root_dir\dir0\dir00\file000
root_dir\dir0\dir01
root_dir\dir0\dir01\file010
root_dir\dir0\dir01\file011
root_dir\dir0\dir02
root_dir\dir0\dir02\dir020
root_dir\dir0\dir02\dir020\dir0200
root_dir\dir1
root_dir\dir1\file10
root_dir\dir1\file11
root_dir\dir1\file12
root_dir\dir2
root_dir\dir2\dir20
root_dir\dir2\dir20\file200
root_dir\dir2\file20
root_dir\dir3
root_dir\file0
root_dir\file1
</code></pre> </blockquote> Notes: <ul> <li>Uses <code>os.listdir</code></li> <li>For large trees (especially if recursive is on), iglob is preferred</li> <li>Allows advanced filtering based on name (due to the wildcard)</li> </ul></li> </ol> <ol start="5"> <li><a href="https://docs.python.org/3/library/pathlib.html#pathlib.Path" rel="noreferrer">[Python 3]: class pathlib.Path(*pathsegments)</a> (Python 3.4+, backport: <a href="https://pypi.org/project/pathlib2" rel="noreferrer">[PyPI]: pathlib2</a>) <blockquote> <pre class="lang-py prettyprint-override"><code>>>> import os, pathlib  # os is needed for os.path.join further down
>>> root_dir = "root_dir"
>>> root_dir_instance = pathlib.Path(root_dir)
>>> root_dir_instance
WindowsPath('root_dir')
>>> root_dir_instance.name
'root_dir'
>>> root_dir_instance.is_dir()
True
>>>
>>> [item.name for item in root_dir_instance.glob("*")]  # Wildcard searching for all direct descendants
['dir0', 'dir1', 'dir2', 'dir3', 
'file0', 'file1']
>>>
>>> [os.path.join(item.parent.name, item.name) for item in root_dir_instance.glob("*") if not item.is_dir()]  # Display paths (including parent) for files only
['root_dir\\file0', 'root_dir\\file1']
</code></pre> </blockquote> Notes: <ul> <li>This is one way of achieving our goal</li> <li>It is the OOP style of handling paths</li> <li>Offers lots of functionality</li> </ul></li> </ol> <ol start="6"> <li><a href="https://docs.python.org/2/library/dircache.html#dircache.listdir" rel="noreferrer">[Python 2]: dircache.listdir(path)</a> (Python 2 only) <ul> <li>But, according to <a href="https://github.com/python/cpython/blob/2.7/Lib/dircache.py" rel="noreferrer">[GitHub]: python/cpython - (2.7) cpython/Lib/dircache.py</a>, it is just a (thin) wrapper over <code>os.listdir</code> with caching</li> </ul> <pre class="lang-py prettyprint-override"><code>def listdir(path):
    """List directory contents, using cache."""
    try:
        cached_mtime, list = cache[path]
        del cache[path]
    except KeyError:
        cached_mtime, list = -1, []
    mtime = os.stat(path).st_mtime
    if mtime != cached_mtime:
        list = os.listdir(path)
        list.sort()
    cache[path] = mtime, list
    return list
</code></pre></li> </ol> <ol start="7"> <li><a href="http://man7.org/linux/man-pages/man3/opendir.3.html" rel="noreferrer">[man7]: OPENDIR(3)</a> / <a href="http://man7.org/linux/man-pages/man3/readdir.3.html" rel="noreferrer">[man7]: READDIR(3)</a> / <a href="http://man7.org/linux/man-pages/man3/closedir.3.html" rel="noreferrer">[man7]: CLOSEDIR(3)</a> via <a href="https://docs.python.org/3/library/ctypes.html#module-ctypes" rel="noreferrer">[Python 3]: ctypes - A foreign function library for Python</a> (POSIX specific) <blockquote> <a href="https://docs.python.org/3/library/ctypes.html#module-ctypes" rel="noreferrer">ctypes</a> is a foreign function library for Python. It provides C compatible data types, and allows calling functions in DLLs or shared libraries. It can be used to wrap these libraries in pure Python. 
</blockquote> code_ctypes.py: <pre class="lang-py prettyprint-override"><code>#!/usr/bin/env python3

import sys
from ctypes import Structure, \
    c_ulonglong, c_longlong, c_ushort, c_ubyte, c_char, c_int, c_char_p, c_void_p, \
    CDLL, POINTER, \
    create_string_buffer, get_errno, set_errno, cast


# d_type values (enumerated values, not bit flags)
DT_DIR = 4
DT_REG = 8

char256 = c_char * 256


class LinuxDirent64(Structure):
    _fields_ = [
        ("d_ino", c_ulonglong),
        ("d_off", c_longlong),
        ("d_reclen", c_ushort),
        ("d_type", c_ubyte),
        ("d_name", char256),
    ]

LinuxDirent64Ptr = POINTER(LinuxDirent64)

libc_dll = this_process = CDLL(None, use_errno=True)
# ALWAYS set argtypes and restype for functions, otherwise it's UB!!!
opendir = libc_dll.opendir
opendir.argtypes = (c_char_p,)
opendir.restype = c_void_p  # DIR* (None when NULL is returned)
readdir = libc_dll.readdir
readdir.argtypes = (c_void_p,)
readdir.restype = c_void_p  # struct dirent* as an address (None when NULL)
closedir = libc_dll.closedir
closedir.argtypes = (c_void_p,)
closedir.restype = c_int


def get_dir_content(path):
    ret = [path, list(), list()]
    dir_stream = opendir(create_string_buffer(path.encode()))
    if not dir_stream:
        print("opendir returned NULL (errno: {:d})".format(get_errno()))
        return ret
    set_errno(0)
    dirent_addr = readdir(dir_stream)
    while dirent_addr:
        dirent_ptr = cast(dirent_addr, LinuxDirent64Ptr)
        dirent = dirent_ptr.contents
        name = dirent.d_name.decode()
        if dirent.d_type == DT_DIR:
            if name not in (".", ".."):
                ret[1].append(name)
        elif dirent.d_type == DT_REG:
            ret[2].append(name)
        dirent_addr = readdir(dir_stream)
    if get_errno():
        print("readdir returned NULL (errno: {:d})".format(get_errno()))
    closedir(dir_stream)
    return ret


def main():
    print("{:s} on {:s}\n".format(sys.version, sys.platform))
    root_dir = "root_dir"
    entries = get_dir_content(root_dir)
    print(entries)


if __name__ == "__main__":
    main()
</code></pre> Notes: <ul> <li>It loads the three functions from libc (loaded in the current process) and calls them (for more details check <a href="https://stackoverflow.com/questions/82831/how-do-i-check-whether-a-file-exists-using-python/44661513#44661513">[SO]: How do I check whether a file exists without exceptions? 
(@CristiFati's answer)</a> - last notes from item #4.). That would place this approach very close to the Python / C edge</li> <li>LinuxDirent64 is the ctypes representation of struct dirent64 from <a href="http://man7.org/linux/man-pages/man0/dirent.h.0p.html" rel="noreferrer">[man7]: dirent.h(0P)</a> (so are the DT_ constants) as found on my machine: Ubtu 16 x64 (4.10.0-40-generic and libc6-dev:amd64). On other flavors/versions the struct definition might differ, and if so, the ctypes alias should be updated, otherwise it will yield undefined behavior</li> <li>It returns data in <code>os.walk</code>'s format. I did not bother to make it recursive, but starting from the existing code, that would be a fairly trivial task</li> <li>Everything is doable on Win as well, but the data (libraries, functions, structs, constants, ...) differ</li> </ul> Output: <blockquote> <pre class="lang-py prettyprint-override"><code>[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q003207219]> ./code_ctypes.py
3.5.2 (default, Nov 12 2018, 13:43:14)
[GCC 5.4.0 20160609] on linux

['root_dir', ['dir2', 'dir1', 'dir3', 'dir0'], ['file1', 'file0']]
</code></pre> </blockquote></li> </ol> <ol start="8"> <li><a href="https://docs.activestate.com/activepython/3.1/pywin32/win32file__FindFilesW_meth.html" rel="noreferrer">[ActiveState.Docs]: win32file.FindFilesW</a> (Win specific) <blockquote> Retrieves a list of matching filenames, using the Windows Unicode API. An interface to the API FindFirstFileW/FindNextFileW/Find close functions. 
</blockquote> <blockquote> <pre class="lang-py prettyprint-override"><code>>>> import os, win32file, win32con
>>> root_dir = "root_dir"
>>> wildcard = "*"
>>> root_dir_wildcard = os.path.join(root_dir, wildcard)
>>> entry_list = win32file.FindFilesW(root_dir_wildcard)
>>> len(entry_list)  # Don't display the whole content as it's too long
8
>>> [entry[-2] for entry in entry_list]  # Only display the entry names
['.', '..', 'dir0', 'dir1', 'dir2', 'dir3', 'file0', 'file1']
>>>
>>> [entry[-2] for entry in entry_list if entry[0] & win32con.FILE_ATTRIBUTE_DIRECTORY and entry[-2] not in (".", "..")]  # Filter entries and only display dir names (except self and parent)
['dir0', 'dir1', 'dir2', 'dir3']
>>>
>>> [os.path.join(root_dir, entry[-2]) for entry in entry_list if entry[0] & (win32con.FILE_ATTRIBUTE_NORMAL | win32con.FILE_ATTRIBUTE_ARCHIVE)]  # Only display file "full" names
['root_dir\\file0', 'root_dir\\file1']
</code></pre> </blockquote> Notes: <ul> <li><code>win32file.FindFilesW</code> is part of <a href="https://github.com/mhammond/pywin32" rel="noreferrer">[GitHub]: mhammond/pywin32 - Python for Windows (pywin32) Extensions</a>, which is a Python wrapper over WINAPIs</li> <li>The documentation link is from <a href="https://www.activestate.com" rel="noreferrer">ActiveState</a>, as I did not find any official PyWin32 documentation</li> </ul></li> </ol> <ol start="9"> <li>Install some (other) third-party package that does the trick <ul> <li>Most likely, it will rely on one (or more) of the above (maybe with slight customizations)</li> </ul></li> </ol> Notes: <ul> <li>Code is meant to be portable (except places that target a specific area, which are marked) or cross: <ul> <li>platform (Nix, Win, )</li> <li>Python version (2, 3, )</li> </ul></li> <li>Multiple path styles (absolute, relative) were used across the above variants, to illustrate the fact that the "tools" used are flexible in this direction</li> <li><code>os.listdir</code> and <code>os.scandir</code> use opendir / readdir / closedir (<a href="https://docs.microsoft.com/en-gb/windows/desktop/api/fileapi/nf-fileapi-findfirstfilew" rel="noreferrer">[MS.Docs]: FindFirstFileW function</a> / <a href="https://docs.microsoft.com/en-gb/windows/desktop/api/fileapi/nf-fileapi-findnextfilew" rel="noreferrer">[MS.Docs]: FindNextFileW function</a> / <a 
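href="https://docs.microsoft.com/en-gb/windows/desktop/api/fileapi/nf-fileapi-findclose" rel="noreferrer">[MS.Docs]: FindClose function</a>) (via <a href="https://github.com/python/cpython/blob/master/Modules/posixmodule.c" rel="noreferrer">[GitHub]: python/cpython - (master) cpython/Modules/posixmodule.c</a>)</li> <li><code>win32file.FindFilesW</code> uses those (Win specific) functions as well (via <a href="https://github.com/mhammond/pywin32/blob/master/win32/src/win32file.i" rel="noreferrer">[GitHub]: mhammond/pywin32 - (master) pywin32/win32/src/win32file.i</a>)</li> <li>_get_dir_content (from point #1.) can be implemented using any of these approaches (some will require more work and some less) <ul> <li>Some advanced filtering (instead of just file vs. dir) could be done: e.g. the include_folders argument could be replaced by another one (e.g. filter_func), which would be a function that takes a path as an argument, with a default that accepts every path (so nothing is stripped out), and inside _get_dir_content something like: <code>if not filter_func(entry_with_path): continue</code> (if the function fails for one entry, it will be skipped); but the more complex the code becomes, the longer it will take to execute</li> </ul></li> <li>Nota bene! Since recursion is used, I must mention that I did some tests on my laptop (Win 10 x64), totally unrelated to this problem, and when the recursion level was reaching values somewhere in the (990 .. 1000) range (recursionlimit - 1000 (default)), I got StackOverflow :). If the directory tree exceeds that limit (I am not an FS expert, so I do not know if that is even possible), that could be a problem. I must also mention that I did not try to increase recursionlimit because I have no experience in the area (how much can I increase it before having to also increase the stack), but in theory there will always be the possibility of failure, if the dir depth is larger than the highest possible recursionlimit (on that machine)</li> <li>The code samples are for demonstrative purposes only. That means that I did not take error handling into account (I do not think there is any try / except / else / finally block), so the code is not robust (the reason is to keep it as simple and short as possible). For production, error handling should be added as well</li> </ul> <h3>Other approaches:</h3> <ol> <li>Use Python only as a wrapper <ul> <li>Everything is done using another technology</li> <li>That technology is invoked from Python</li> <li>The most famous flavor that I know is what I call the system administrator approach: <ul> <li>Use Python (or any programming language for that matter) in order to execute shell commands (and parse their outputs)</li> <li>Some consider this a neat hack</li> <li>I consider it more like a lame workaround (gainarie), as the action per se is performed from the shell (cmd in this case), and thus does not have anything to do with Python.</li> <li>Filtering (<code>grep</code> / <code>findstr</code>) or output formatting could be done on both sides, but I am not going to insist on it. Also, I deliberately used <code>os.system</code> instead of <code>subprocess.Popen</code>.</li> </ul> <blockquote> <pre class="lang-py prettyprint-override"><code>(py35x64_test) E:\Work\Dev\StackOverflow\q003207219>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" -c "import os;os.system(\"dir /b root_dir\")"
dir0
dir1
dir2
dir3
file0
file1
</code></pre> </blockquote></li> </ul> In general this approach is to be avoided, since if some command output format slightly differs between OS versions/flavors, the parsing code should be adapted as well; not to mention differences between locales.</li> </ol>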
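
The filtering idea from the notes (replacing include_folders with a predicate) could look roughly like the sketch below. This is not the answer's own code: the names iter_dir_content and filter_func are hypothetical, and one design choice is an assumption, namely that the predicate only decides what gets yielded while directories are still descended into, so that a files-only filter keeps working recursively.

```python
import os


def iter_dir_content(path, filter_func=lambda entry_path: True, recursive=True):
    """Yield the paths below `path` for which filter_func(path) is truthy.

    filter_func is a hypothetical stand-in for the include_folders flag;
    the default accepts every path, so nothing is stripped out.
    """
    for entry in sorted(os.listdir(path)):  # sorted only to get a stable order
        entry_with_path = os.path.join(path, entry)
        if filter_func(entry_with_path):
            yield entry_with_path
        # Assumption: rejected directories are still walked, otherwise a
        # files-only predicate would never reach files in subdirectories.
        if recursive and os.path.isdir(entry_with_path):
            for sub_entry in iter_dir_content(entry_with_path, filter_func, recursive):
                yield sub_entry
```

For instance, list(iter_dir_content("root_dir", filter_func=os.path.isfile)) would collect only the regular files of the tree, while the default predicate returns files and directories alike.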

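Regarding the recursion limit concern raised in the notes: one way to sidestep it is to keep the directories still to be visited in an explicit list instead of relying on the call stack. The following is a minimal sketch under that assumption (the function name iter_files_iterative is hypothetical, and it relies on os.scandir, so Python 3.5+); it is an alternative to the recursive helpers above, not a rewrite of them.

```python
import os


def iter_files_iterative(top):
    """Yield every non-directory path below `top` without using recursion."""
    pending = [top]  # explicit stack of directories that still have to be scanned
    while pending:
        current = pending.pop()
        for entry in os.scandir(current):
            # Symlinks to directories are not followed, which avoids link cycles
            if entry.is_dir(follow_symlinks=False):
                pending.append(entry.path)
            else:
                yield entry.path
```

Because the depth of the tree only grows the pending list, the interpreter's recursion limit no longer applies; the order in which entries come out is arbitrary, so sort the result if a stable order matters.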