<h2>Preliminary notes</h2>
<ul>
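<li>Although there's a clear differentiation between the <em>file</em> and <em>directory</em> terms in the question text, some may argue that directories are actually special files</li>
<li>The statement "<em>all files of a directory</em>" can be interpreted in two ways:
<ol>
<li>All <strong>direct</strong> (or level 1) descendants <strong>only</strong></li>
<li>All descendants in the whole directory tree (including the ones in subdirectories)</li>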
</ol></li>
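<li><p>When the question was asked, I imagine that <em>Python <strong>2</strong></em> was the <em>LTS</em> version; however, the code samples will be run by <em>Python <strong>3</strong>(<strong>.5</strong>)</em> (I'll keep them as <em>Python 2</em> compliant as possible; also, any code belonging to <em>Python</em> that I'm going to post is from <strong>v3.5.4</strong>, unless otherwise specified). That has consequences related to another keyword in the question: "<em>add them into a <strong>list</strong></em>":</p>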
<ul>
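<li>In pre <em>Python 2.2</em> versions, sequences (iterables) were mostly represented by lists (tuples, sets, ...)</li>
<li>In <em>Python 2.2</em>, the concept of generators was introduced (courtesy of <a href="https://docs.python.org/3/reference/simple_stmts.html#the-yield-statement" rel="noreferrer">[Python 3]: The yield statement</a>). As time passed, generator counterparts started to appear for functions that returned/worked with lists</li>
<li>In <em>Python 3</em>, lazy evaluation is the default behavior: many built-ins (<em>map</em>, <em>filter</em>, <em>zip</em>, ...) return iterators instead of lists</li>
<li>Not sure if returning a list is still mandatory (or a generator would do as well), but passing a generator to the <em>list</em> constructor will create a list out of it (and also consume it). The example below illustrates the differences on <a href="https://docs.python.org/3/library/functions.html#map" rel="noreferrer">[Python 3]: <strong>map</strong>(<em>function, iterable, ...</em>)</a></li>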
</ul>
<blockquote>
<pre class="lang-py prettyprint-override"><code>>>> import sys
>>> sys.version
'2.7.10 (default, Mar 8 2016, 15:02:46) [MSC v.1600 64 bit (AMD64)]'
>>> m = map(lambda x: x, [1, 2, 3]) # Just a dummy lambda function
>>> m, type(m)
([1, 2, 3], <type 'list'>)
>>> len(m)
3
</code></pre>
</blockquote>
<p><br/></p>
<blockquote>
<pre class="lang-py prettyprint-override"><code>>>> import sys
>>> sys.version
'3.5.4 (v3.5.4:3f56838, Aug 8 2017, 02:17:05) [MSC v.1900 64 bit (AMD64)]'
>>> m = map(lambda x: x, [1, 2, 3])
>>> m, type(m)
(<map object at 0x000001B4257342B0>, <class 'map'>)
>>> len(m)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'map' has no len()
>>> lm0 = list(m) # Build a list from the generator
>>> lm0, type(lm0)
([1, 2, 3], <class 'list'>)
>>>
>>> lm1 = list(m) # Build a list from the same generator
>>> lm1, type(lm1) # Empty list now - generator already consumed
([], <class 'list'>)
</code></pre>
</blockquote></li>
<li><p>The examples will be based on a directory called <em>root_dir</em> with the following structure (this example is for <em>Win</em>, but I'm using the same tree on <em>Lnx</em> as well):</p>
<blockquote>
<pre class="lang-py prettyprint-override"><code>E:\Work\Dev\StackOverflow\q003207219>tree /f "root_dir"
Folder PATH listing for volume Work
Volume serial number is 00000029 3655:6FED
E:\WORK\DEV\STACKOVERFLOW\Q003207219\ROOT_DIR
¦   file0
¦   file1
¦
+---dir0
¦   +---dir00
¦   ¦   ¦   file000
¦   ¦   ¦
¦   ¦   +---dir000
¦   ¦           file0000
¦   ¦
¦   +---dir01
¦   ¦       file010
¦   ¦       file011
¦   ¦
¦   +---dir02
¦       +---dir020
¦           +---dir0200
+---dir1
¦       file10
¦       file11
¦       file12
¦
+---dir2
¦   ¦   file20
¦   ¦
¦   +---dir20
¦           file200
¦
+---dir3
</code></pre>
</blockquote></li>
</ul>
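<p>To make the list-versus-generator point above concrete, here is a minimal sketch (the helper name <em>iter_names</em> is made up for illustration; it is not part of any library). It shows that a generator is consumed by the first pass over it, so a second <em>list</em> call yields an empty list, just like the <em>map</em> object in the <em>Python 3</em> session:</p>
<pre class="lang-py prettyprint-override"><code>def iter_names(names):
    """Hypothetical helper: lazily yield each name (a generator function)."""
    for name in names:
        yield name


gen = iter_names(["file0", "file1"])
first_pass = list(gen)   # Consumes the generator while building the list
second_pass = list(gen)  # Nothing is left to consume

print(first_pass)   # ['file0', 'file1']
print(second_pass)  # []
</code></pre>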
<p><br/></p>
<h2>Solutions</h2>
<h3>Programmatic approaches:</h3>
<ol>
<li><p><a href="https://docs.python.org/3/library/os.html#os.listdir" rel="noreferrer">[Python 3]: os.<strong>listdir</strong>(<em>path='.'</em>)</a></p>
<blockquote>
<p>Return a list containing the names of the entries in the directory given by path. The list is in arbitrary order, and does not include the special entries <code>'.'</code> and <code>'..'</code> ...</p>
</blockquote>
<p><br/></p>
<blockquote>
<pre class="lang-py prettyprint-override"><code>>>> import os
>>> root_dir = "root_dir" # Path relative to current dir (os.getcwd())
>>>
>>> os.listdir(root_dir) # List all the items in root_dir
['dir0', 'dir1', 'dir2', 'dir3', 'file0', 'file1']
>>>
>>> [item for item in os.listdir(root_dir) if os.path.isfile(os.path.join(root_dir, item))] # Filter items and only keep files (strip out directories)
['file0', 'file1']
</code></pre>
</blockquote>
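<p>The filtering one-liner from the session above could be wrapped in a small reusable function. This is only a sketch: the name <em>list_files</em> and the throwaway directory built with <em>tempfile</em> are assumptions made so that the snippet runs on its own, and the result is sorted because <code>os.listdir</code> returns entries in arbitrary order:</p>
<pre class="lang-py prettyprint-override"><code>import os
import tempfile


def list_files(path):
    """Return the sorted names of the regular files directly inside path."""
    return sorted(
        name for name in os.listdir(path)
        if os.path.isfile(os.path.join(path, name))
    )


# Build a tiny throwaway tree so that the sketch is self-contained
demo_dir = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_dir, "dir0"))
for name in ("file1", "file0"):
    open(os.path.join(demo_dir, name), "w").close()

print(list_files(demo_dir))  # ['file0', 'file1']
</code></pre>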
<p>A more elaborate example (<em>code_os_listdir.py</em>):</p>
<pre class="lang-py prettyprint-override"><code>import os
from pprint import pformat


def _get_dir_content(path, include_folders, recursive):
    entries = os.listdir(path)
    for entry in entries:
        entry_with_path = os.path.join(path, entry)
        if os.path.isdir(entry_with_path):
            if include_folders:
                yield entry_with_path
            if recursive:
                for sub_entry in _get_dir_content(entry_with_path, include_folders, recursive):
                    yield sub_entry
        else:
            yield entry_with_path


def get_dir_content(path, include_folders=True, recursive=True, prepend_folder_name=True):
    path_len = len(path) + len(os.path.sep)
    for item in _get_dir_content(path, include_folders, recursive):
        yield item if prepend_folder_name else item[path_len:]


def _get_dir_content_old(path, include_folders, recursive):
    entries = os.listdir(path)
    ret = list()
    for entry in entries:
        entry_with_path = os.path.join(path, entry)
        if os.path.isdir(entry_with_path):
            if include_folders:
                ret.append(entry_with_path)
            if recursive:
                ret.extend(_get_dir_content_old(entry_with_path, include_folders, recursive))
        else:
            ret.append(entry_with_path)
    return ret


def get_dir_content_old(path, include_folders=True, recursive=True, prepend_folder_name=True):
    path_len = len(path) + len(os.path.sep)
    return [item if prepend_folder_name else item[path_len:] for item in _get_dir_content_old(path, include_folders, recursive)]


def main():
    root_dir = "root_dir"
    ret0 = get_dir_content(root_dir, include_folders=True, recursive=True, prepend_folder_name=True)
    lret0 = list(ret0)
    print(ret0, len(lret0), pformat(lret0))
    ret1 = get_dir_content_old(root_dir, include_folders=False, recursive=True, prepend_folder_name=False)
    print(len(ret1), pformat(ret1))


if __name__ == "__main__":
    main()
</code></pre>
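<p><strong>Notes</strong>:</p>
<ul>
<li>There are two implementations:
<ul>
<li>One that uses generators (of course here it seems useless, since I immediately convert the result to a list)</li>
<li>The classic one (function names ending in <strong>_old</strong>)</li>
</ul></li>
<li>Recursion is used (to get into subdirectories)</li>
<li>For each implementation there are two functions:
<ul>
<li>One that starts with an underscore (<em>_</em>): "private" (should not be called directly), which does all the work</li>
<li>The public one (a wrapper over the previous): it just strips off the initial path (if required) from the returned entries. It's an ugly implementation, but it's the only idea that I could come up with at this point</li>
</ul></li>
<li>In terms of performance, generators are generally a little bit faster (considering both <em>creation</em> and <em>iteration</em> times), but I didn't test them in recursive functions, and I'm also iterating inside the function over inner generators, so I don't know how performance-friendly that is</li>
<li>Play with the arguments to get different results</li>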
</ul>
<p><br/></p>
<p><strong>Output</strong>:</p>
<blockquote>
<pre class="lang-py prettyprint-override"><code>(py35x64_test) E:\Work\Dev\StackOverflow\q003207219>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" "code_os_listdir.py"
<generator object get_dir_content at 0x000001BDDBB3DF10> 22 ['root_dir\\dir0',
'root_dir\\dir0\\dir00',
'root_dir\\dir0\\dir00\\dir000',
'root_dir\\dir0\\dir00\\dir000\\file0000',
'root_dir\\dir0\\dir00\\file000',
'root_dir\\dir0\\dir01',
'root_dir\\dir0\\dir01\\file010',
'root_dir\\dir0\\dir01\\file011',
'root_dir\\dir0\\dir02',
'root_dir\\dir0\\dir02\\dir020',
'root_dir\\dir0\\dir02\\dir020\\dir0200',
'root_dir\\dir1',
'root_dir\\dir1\\file10',
'root_dir\\dir1\\file11',
'root_dir\\dir1\\file12',
'root_dir\\dir2',
'root_dir\\dir2\\dir20',
'root_dir\\dir2\\dir20\\file200',
'root_dir\\dir2\\file20',
'root_dir\\dir3',
'root_dir\\file0',
'root_dir\\file1']
11 ['dir0\\dir00\\dir000\\file0000',
'dir0\\dir00\\file000',
'dir0\\dir01\\file010',
'dir0\\dir01\\file011',
'dir1\\file10',
'dir1\\file11',
'dir1\\file12',
'dir2\\dir20\\file200',
'dir2\\file20',
'file0',
'file1']
</code></pre>
</blockquote></li>
</ol>
<p><br/></p>
<ol start="2">
<li><p><a href="https://docs.python.org/3/library/os.html#os.scandir" rel="noreferrer">[Python 3]: os.<strong>scandir</strong>(<em>path='.'</em>)</a> (<em>Python <strong>3.5</strong></em>+, backport: <a href="https://pypi.org/project/scandir" rel="noreferrer">[PyPI]: scandir</a>)</p>
<blockquote>
<p>Return an iterator of <a href="https://docs.python.org/3/library/os.html#os.DirEntry" rel="noreferrer">os.DirEntry</a> objects corresponding to the entries in the directory given by <em>path</em>. The entries are yielded in arbitrary order, and the special entries <code>'.'</code> and <code>'..'</code> are not included.</p>
<p>Using <a href="https://docs.python.org/3/library/os.html#os.scandir" rel="noreferrer">scandir()</a> instead of <a href="https://docs.python.org/3/library/os.html#os.listdir" rel="noreferrer">listdir()</a> can significantly increase the performance of code that also needs file type or file attribute information, because <a href="https://docs.python.org/3/library/os.html#os.DirEntry" rel="noreferrer">os.DirEntry</a> objects expose this information if the operating system provides it when scanning a directory. All <a href="https://docs.python.org/3/library/os.html#os.DirEntry" rel="noreferrer">os.DirEntry</a> methods may perform a system call, but <a href="https://docs.python.org/3/library/os.html#os.DirEntry.is_dir" rel="noreferrer">is_dir()</a> and <a href="https://docs.python.org/3/library/os.html#os.DirEntry.is_file" rel="noreferrer">is_file()</a> usually only require a system call for symbolic links; <a href="https://docs.python.org/3/library/os.html#os.DirEntry.stat" rel="noreferrer">os.DirEntry.stat()</a> always requires a system call on Unix but only requires one for symbolic links on Windows.</p>
</blockquote>
<p><br/></p>
<blockquote>
<pre class="lang-py prettyprint-override"><code>>>> import os
>>> root_dir = os.path.join(".", "root_dir") # Explicitly prepending current directory
>>> root_dir
'.\\root_dir'
>>>
>>> scandir_iterator = os.scandir(root_dir)
>>> scandir_iterator
<nt.ScandirIterator object at 0x00000268CF4BC140>
>>> [item.path for item in scandir_iterator]
['.\\root_dir\\dir0', '.\\root_dir\\dir1', '.\\root_dir\\dir2', '.\\root_dir\\dir3', '.\\root_dir\\file0', '.\\root_dir\\file1']
>>>
>>> [item.path for item in scandir_iterator] # Will yield an empty list as it was consumed by previous iteration (automatically performed by the list comprehension)
[]
>>>
>>> scandir_iterator = os.scandir(root_dir) # Reinitialize the generator
>>> for item in scandir_iterator:
...     if os.path.isfile(item.path):
...         print(item.name)
...
file0
file1
</code></pre>
</blockquote>
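<p>The session above checks each entry with <code>os.path.isfile</code>. As the quoted documentation suggests, the <code>os.DirEntry</code> objects can answer that question themselves, usually without an extra system call. A minimal sketch of that variant (the function name <em>scan_files</em> and the temporary directory are assumptions made for illustration):</p>
<pre class="lang-py prettyprint-override"><code>import os
import tempfile


def scan_files(path):
    """Return sorted file names in path, asking each DirEntry for its type."""
    # DirEntry.is_file() can usually answer from data gathered while scanning
    return sorted(entry.name for entry in os.scandir(path) if entry.is_file())


# Build a tiny throwaway tree so that the sketch is self-contained
demo_dir = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_dir, "dir0"))
for name in ("file0", "file1"):
    open(os.path.join(demo_dir, name), "w").close()

print(scan_files(demo_dir))  # ['file0', 'file1']
</code></pre>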
<p><strong>Notes</strong>:</p>
<ul>
<li>It's similar to <code>os.listdir</code></li>
<li>But it's also more flexible (and offers more functionality), more <em>Python</em>ic (and in some cases, faster)</li>
</ul></li>
</ol>
<p><br/></p>
<ol start="3">
<li><p><a href="https://docs.python.org/3/library/os.html#os.walk" rel="noreferrer">[Python 3]: os.<strong>walk</strong>(<em>top, topdown=True, onerror=None, followlinks=False</em>)</a></p>
<blockquote>
<p>Generate the file names in a directory tree by walking the tree either top-down or bottom-up. For each directory in the tree rooted at directory <em>top</em> (including <em>top</em> itself), it yields a 3-tuple (<code>dirpath</code>, <code>dirnames</code>, <code>filenames</code>).</p>
</blockquote>
<p><br/></p>
<blockquote>
<pre class="lang-py prettyprint-override"><code>>>> import os
>>> root_dir = os.path.join(os.getcwd(), "root_dir") # Specify the full path
>>> root_dir
'E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir'
>>>
>>> walk_generator = os.walk(root_dir)
>>> root_dir_entry = next(walk_generator) # First entry corresponds to the root dir (passed as an argument)
>>> root_dir_entry
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir', ['dir0', 'dir1', 'dir2', 'dir3'], ['file0', 'file1'])
>>>
>>> root_dir_entry[1] + root_dir_entry[2] # Display dirs and files (direct descendants) in a single list
['dir0', 'dir1', 'dir2', 'dir3', 'file0', 'file1']
>>>
>>> [os.path.join(root_dir_entry[0], item) for item in root_dir_entry[1] + root_dir_entry[2]] # Display all the entries in the previous list by their full path
['E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0', 'E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir1', 'E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir2', 'E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir3', 'E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\file0', 'E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\file1']
>>>
>>> for entry in walk_generator: # Display the rest of the elements (corresponding to every subdir)
...     print(entry)
...
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0', ['dir00', 'dir01', 'dir02'], [])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0\\dir00', ['dir000'], ['file000'])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0\\dir00\\dir000', [], ['file0000'])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0\\dir01', [], ['file010', 'file011'])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0\\dir02', ['dir020'], [])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0\\dir02\\dir020', ['dir0200'], [])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir0\\dir02\\dir020\\dir0200', [], [])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir1', [], ['file10', 'file11', 'file12'])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir2', ['dir20'], ['file20'])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir2\\dir20', [], ['file200'])
('E:\\Work\\Dev\\StackOverflow\\q003207219\\root_dir\\dir3', [], [])
</code></pre>
</blockquote>
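<p>Since the question asks for a single list, the 3-tuples yielded by <code>os.walk</code> can be flattened. The following is a sketch under assumptions (the name <em>walk_files</em> and the temporary tree are invented for the example); it returns paths relative to the starting directory:</p>
<pre class="lang-py prettyprint-override"><code>import os
import tempfile


def walk_files(top):
    """Return the sorted paths (relative to top) of every file below top."""
    result = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            result.append(os.path.relpath(full_path, top))
    return sorted(result)


# Build a tiny throwaway tree so that the sketch is self-contained
demo_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_dir, "dir0", "dir00"))
for parts in (("file0",), ("dir0", "file00"), ("dir0", "dir00", "file000")):
    open(os.path.join(demo_dir, *parts), "w").close()

for path in walk_files(demo_dir):
    print(path)
</code></pre>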
<p><strong>Notes</strong>:</p>
<ul>
<li>Under the hood, it uses <code>os.scandir</code> (<code>os.listdir</code> on older versions)</li>
<li>It does the heavy lifting by recursing into subfolders</li>
</ul></li>
</ol>
<p><br/></p>
<ol start="4">
<li><p><a href="https://docs.python.org/3/library/glob.html#glob.glob" rel="noreferrer">[Python 3]: glob.<strong>glob</strong>(<em>pathname, *, recursive=False</em>)</a> (<a href="https://docs.python.org/3/library/glob.html#glob.iglob" rel="noreferrer">[Python 3]: glob.<strong>iglob</strong>(<em>pathname, *, recursive=False</em>)</a>)</p>
<blockquote>
<p>Return a possibly-empty list of path names that match <em>pathname</em>, which must be a string containing a path specification. <em>pathname</em> can be either absolute (like <code>/usr/src/Python-1.5/Makefile</code>) or relative (like <code>../../Tools/*/*.gif</code>), and can contain shell-style wildcards. Broken symlinks are included in the results (as in the shell).<br/>...<br/><strong><em>Changed in version 3.5</em></strong>: Support for recursive globs using “<code>**</code>”.</p>
</blockquote>
<p><br/></p>
<blockquote>
<pre class="lang-py prettyprint-override"><code>>>> import glob, os
>>> wildcard_pattern = "*"
>>> root_dir = os.path.join("root_dir", wildcard_pattern) # Match every file/dir name
>>> root_dir
'root_dir\\*'
>>>
>>> glob_list = glob.glob(root_dir)
>>> glob_list
['root_dir\\dir0', 'root_dir\\dir1', 'root_dir\\dir2', 'root_dir\\dir3', 'root_dir\\file0', 'root_dir\\file1']
>>>
>>> [item.replace("root_dir" + os.path.sep, "") for item in glob_list] # Strip the dir name and the path separator from begining
['dir0', 'dir1', 'dir2', 'dir3', 'file0', 'file1']
>>>
>>> for entry in glob.iglob(root_dir + "*", recursive=True):
...     print(entry)
...
root_dir\
root_dir\dir0
root_dir\dir0\dir00
root_dir\dir0\dir00\dir000
root_dir\dir0\dir00\dir000\file0000
root_dir\dir0\dir00\file000
root_dir\dir0\dir01
root_dir\dir0\dir01\file010
root_dir\dir0\dir01\file011
root_dir\dir0\dir02
root_dir\dir0\dir02\dir020
root_dir\dir0\dir02\dir020\dir0200
root_dir\dir1
root_dir\dir1\file10
root_dir\dir1\file11
root_dir\dir1\file12
root_dir\dir2
root_dir\dir2\dir20
root_dir\dir2\dir20\file200
root_dir\dir2\file20
root_dir\dir3
root_dir\file0
root_dir\file1
</code></pre>
</blockquote>
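<p>The name-based filtering that wildcards allow can be combined with the recursive "<code>**</code>" pattern. Below is a sketch only: <em>glob_files</em> and the file names are made up for the example, and <code>glob.escape</code> is applied to the directory part so that special characters in it are taken literally:</p>
<pre class="lang-py prettyprint-override"><code>import glob
import os
import tempfile


def glob_files(top, pattern="*"):
    """Return sorted paths of the files below top whose name matches pattern."""
    # "**" matches zero or more nested directories when recursive=True
    full_pattern = os.path.join(glob.escape(top), "**", pattern)
    return sorted(
        path for path in glob.iglob(full_pattern, recursive=True)
        if os.path.isfile(path)
    )


# Build a tiny throwaway tree so that the sketch is self-contained
demo_dir = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_dir, "dir0"))
for parts in (("file0.txt",), ("dir0", "file00.txt"), ("dir0", "file01.log")):
    open(os.path.join(demo_dir, *parts), "w").close()

for path in glob_files(demo_dir, "*.txt"):
    print(os.path.relpath(path, demo_dir))
</code></pre>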
<p><strong>Notes</strong>:</p>
<ul>
<li>Uses <code>os.listdir</code></li>
<li>For large trees (especially if <em>recursive</em> is on), <em>iglob</em> is preferred</li>
<li>Allows advanced filtering based on name (due to the wildcard)</li>
</ul></li>
</ol>
<p><br/></p>
<ol start="5">
<li><p><a href="https://docs.python.org/3/library/pathlib.html#pathlib.Path" rel="noreferrer">[Python 3]: class pathlib.<strong>Path</strong>(<em>*pathsegments</em>)</a> (<em>Python <strong>3.4</strong></em>+, backport: <a href="https://pypi.org/project/pathlib2" rel="noreferrer">[PyPI]: pathlib2</a>)</p>
<blockquote>
<pre class="lang-py prettyprint-override"><code>>>> import os, pathlib
>>> root_dir = "root_dir"
>>> root_dir_instance = pathlib.Path(root_dir)
>>> root_dir_instance
WindowsPath('root_dir')
>>> root_dir_instance.name
'root_dir'
>>> root_dir_instance.is_dir()
True
>>>
>>> [item.name for item in root_dir_instance.glob("*")] # Wildcard searching for all direct descendants
['dir0', 'dir1', 'dir2', 'dir3', 'file0', 'file1']
>>>
>>> [os.path.join(item.parent.name, item.name) for item in root_dir_instance.glob("*") if not item.is_dir()] # Display paths (including parent) for files only
['root_dir\\file0', 'root_dir\\file1']
</code></pre>
</blockquote>
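<p>For the second interpretation of the question (the whole tree), <em>pathlib</em> offers <code>Path.rglob</code>. A minimal sketch, assuming a helper named <em>rglob_files</em> (hypothetical) and a temporary tree created just for the demonstration:</p>
<pre class="lang-py prettyprint-override"><code>import pathlib
import tempfile


def rglob_files(top):
    """Return sorted POSIX-style relative paths of every file below top."""
    root = pathlib.Path(top)
    return sorted(
        item.relative_to(root).as_posix()
        for item in root.rglob("*")
        if item.is_file()
    )


# Build a tiny throwaway tree so that the sketch is self-contained
demo_dir = pathlib.Path(tempfile.mkdtemp())
(demo_dir / "dir0").mkdir()
(demo_dir / "file0").touch()
(demo_dir / "dir0" / "file00").touch()

print(rglob_files(demo_dir))  # ['dir0/file00', 'file0']
</code></pre>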
<p><strong>Notes</strong>:</p>
<ul>
<li>This is one way of achieving our goal</li>
<li>It's the <em>OOP</em> style of handling paths</li>
<li>Offers lots of functionalities</li>
</ul></li>
</ol>
<p><br/></p>
<ol start="6">
<li><p><a href="https://docs.python.org/2/library/dircache.html#dircache.listdir" rel="noreferrer">[Python 2]: dircache.listdir(path)</a> (<em>Python <strong>2</strong></em> only)</p>
<ul>
<li>But, according to <a href="https://github.com/python/cpython/blob/2.7/Lib/dircache.py" rel="noreferrer">[GitHub]: python/cpython - (2.7) cpython/Lib/dircache.py</a>, it's just a (thin) wrapper over <code>os.listdir</code> with caching</li>
</ul>
<p><br/></p>
<pre class="lang-py prettyprint-override"><code>def listdir(path):
    """List directory contents, using cache."""
    try:
        cached_mtime, list = cache[path]
        del cache[path]
    except KeyError:
        cached_mtime, list = -1, []
    mtime = os.stat(path).st_mtime
    if mtime != cached_mtime:
        list = os.listdir(path)
        list.sort()
    cache[path] = mtime, list
    return list
</code></pre></li>
</ol>
<p><br/></p>
<ol start="7">
<li><p><a href="http://man7.org/linux/man-pages/man3/opendir.3.html" rel="noreferrer">[man7]: OPENDIR(3)</a> / <a href="http://man7.org/linux/man-pages/man3/readdir.3.html" rel="noreferrer">[man7]: READDIR(3)</a> / <a href="http://man7.org/linux/man-pages/man3/closedir.3.html" rel="noreferrer">[man7]: CLOSEDIR(3)</a> via <a href="https://docs.python.org/3/library/ctypes.html#module-ctypes" rel="noreferrer">[Python 3]: ctypes - A foreign function library for Python</a> (<em>POSIX</em> specific)</p>
<blockquote>
<p><a href="https://docs.python.org/3/library/ctypes.html#module-ctypes" rel="noreferrer">ctypes</a> is a foreign function library for Python. It provides C compatible data types, and allows calling functions in DLLs or shared libraries. It can be used to wrap these libraries in pure Python.</p>
</blockquote>
<p><em>code_ctypes.py</em>:</p>
<pre class="lang-py prettyprint-override"><code>#!/usr/bin/env python3

import sys
from ctypes import Structure, \
    c_ulonglong, c_longlong, c_ushort, c_ubyte, c_char, c_char_p, c_void_p, c_int, \
    CDLL, POINTER, \
    get_errno, set_errno


DT_DIR = 4
DT_REG = 8

char256 = c_char * 256


class LinuxDirent64(Structure):
    _fields_ = [
        ("d_ino", c_ulonglong),
        ("d_off", c_longlong),
        ("d_reclen", c_ushort),
        ("d_type", c_ubyte),
        ("d_name", char256),
    ]

LinuxDirent64Ptr = POINTER(LinuxDirent64)

libc_dll = this_process = CDLL(None, use_errno=True)
# ALWAYS set argtypes and restype for functions, otherwise it's UB!!!
opendir = libc_dll.opendir
opendir.argtypes = [c_char_p]
opendir.restype = c_void_p  # Opaque DIR pointer; NULL is returned as None
readdir = libc_dll.readdir
readdir.argtypes = [c_void_p]
readdir.restype = LinuxDirent64Ptr  # A NULL pointer evaluates to False
closedir = libc_dll.closedir
closedir.argtypes = [c_void_p]
closedir.restype = c_int


def get_dir_content(path):
    ret = [path, list(), list()]
    dir_stream = opendir(path.encode())
    if not dir_stream:
        print("opendir returned NULL (errno: {:d})".format(get_errno()))
        return ret
    set_errno(0)
    dirent_ptr = readdir(dir_stream)
    while dirent_ptr:
        dirent = dirent_ptr.contents
        name = dirent.d_name.decode()
        if dirent.d_type == DT_DIR:  # d_type is an enumeration, not a bit mask
            if name not in (".", ".."):
                ret[1].append(name)
        elif dirent.d_type == DT_REG:
            ret[2].append(name)
        dirent_ptr = readdir(dir_stream)
    if get_errno():
        print("readdir returned NULL (errno: {:d})".format(get_errno()))
    closedir(dir_stream)
    return ret


def main():
    print("{:s} on {:s}\n".format(sys.version, sys.platform))
    root_dir = "root_dir"
    entries = get_dir_content(root_dir)
    print(entries)


if __name__ == "__main__":
    main()
</code></pre>
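<p><strong>Notes</strong>:</p>
<ul>
<li>It loads the three functions from <em>libc</em> (loaded in the current process) and calls them (for more details check <a href="https://stackoverflow.com/questions/82831/how-do-i-check-whether-a-file-exists-using-python/44661513#44661513">[SO]: How do I check whether a file exists without exceptions? (@CristiFati's answer)</a>, last notes from item <strong>#4.</strong>). That places this approach very close to the <em>Python</em> / <em>C</em> edge</li>
<li><em>LinuxDirent64</em> is the <em>ctypes</em> representation of <em>struct dirent64</em> from <a href="http://man7.org/linux/man-pages/man0/dirent.h.0p.html" rel="noreferrer">[man7]: dirent.h(0P)</a> (so are the <em>DT_</em> constants) on my machine: <em>Ubtu 16 x64</em> (<em>4.10.0-40-generic</em> and <em>libc6-dev:amd64</em>). On other flavors/versions, the struct definition might differ, and if so, the <em>ctypes</em> alias should be updated, otherwise it will yield <strong>Undefined Behavior</strong></li>
<li>It returns data in <code>os.walk</code>'s format. I didn't bother to make it recursive, but starting from the existing code, that would be a fairly trivial task</li>
<li>Everything is doable on <em>Win</em> as well; the data (libraries, functions, structs, constants, ...) differ</li>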
</ul>
<p><br/></p>
<p><strong>Output</strong>:</p>
<blockquote>
<pre class="lang-py prettyprint-override"><code>[cfati@cfati-ubtu16x64-0:~/Work/Dev/StackOverflow/q003207219]> ./code_ctypes.py
3.5.2 (default, Nov 12 2018, 13:43:14)
[GCC 5.4.0 20160609] on linux
['root_dir', ['dir2', 'dir1', 'dir3', 'dir0'], ['file1', 'file0']]
</code></pre>
</blockquote></li>
</ol>
<p><br/></p>
<ol start="8">
<li><p><a href="https://docs.activestate.com/activepython/3.1/pywin32/win32file__FindFilesW_meth.html" rel="noreferrer">[ActiveState.Docs]: win32file.FindFilesW</a> (<em>Win</em> specific)</p>
<blockquote>
<p>Retrieves a list of matching filenames, using the Windows Unicode API. An interface to the API FindFirstFileW/FindNextFileW/Find close functions.</p>
</blockquote>
<p><br/></p>
<blockquote>
<pre class="lang-py prettyprint-override"><code>>>> import os, win32file, win32con
>>> root_dir = "root_dir"
>>> wildcard = "*"
>>> root_dir_wildcard = os.path.join(root_dir, wildcard)
>>> entry_list = win32file.FindFilesW(root_dir_wildcard)
>>> len(entry_list) # Don't display the whole content as it's too long
8
>>> [entry[-2] for entry in entry_list] # Only display the entry names
['.', '..', 'dir0', 'dir1', 'dir2', 'dir3', 'file0', 'file1']
>>>
>>> [entry[-2] for entry in entry_list if entry[0] & win32con.FILE_ATTRIBUTE_DIRECTORY and entry[-2] not in (".", "..")] # Filter entries and only display dir names (except self and parent)
['dir0', 'dir1', 'dir2', 'dir3']
>>>
>>> [os.path.join(root_dir, entry[-2]) for entry in entry_list if entry[0] & (win32con.FILE_ATTRIBUTE_NORMAL | win32con.FILE_ATTRIBUTE_ARCHIVE)] # Only display file "full" names
['root_dir\\file0', 'root_dir\\file1']
</code></pre>
</blockquote>
<p><strong>Notes</strong>:</p>
<ul>
<li><code>win32file.FindFilesW</code> is part of <a href="https://github.com/mhammond/pywin32" rel="noreferrer">[GitHub]: mhammond/pywin32 - Python for Windows (pywin32) Extensions</a>, which is a <em>Python</em> wrapper over <em>WINAPI</em>s</li>
<li>The documentation link is from <a href="https://www.activestate.com" rel="noreferrer">ActiveState</a>, as I didn't find any official <em>PyWin32</em> documentation</li>
</ul></li>
</ol>
<p><br/></p>
<ol start="9">
<li>Install some (other) third-party package
<ul>
<li>Most likely, it will rely on one (or more) of the above (maybe with slight customizations)</li>
</ul></li>
</ol>
<p><br/></p>
<p><strong>Notes</strong>:</p>
<ul>
<li><p>Code is meant to be portable (except places that target a specific area, which are marked) or cross:</p>
<ul>
<li>platform (<em>Nix</em>, <em>Win</em>, ...)</li>
<li><em>Python</em> version (2, 3, ...)</li>
</ul></li>
<li><p>Multiple path styles (absolute, relative) were used across the above variants, to illustrate that the "tools" used are flexible in this direction</p></li>
<li><p><code>os.listdir</code> and <code>os.scandir</code> use <em>opendir</em> / <em>readdir</em> / <em>closedir</em> (<a href="https://docs.microsoft.com/en-gb/windows/desktop/api/fileapi/nf-fileapi-findfirstfilew" rel="noreferrer">[MS.Docs]: FindFirstFileW function</a> / <a href="https://docs.microsoft.com/en-gb/windows/desktop/api/fileapi/nf-fileapi-findnextfilew" rel="noreferrer">[MS.Docs]: FindNextFileW function</a> / <a href="https://docs.microsoft.com/en-gb/windows/desktop/api/fileapi/nf-fileapi-findclose" rel="noreferrer">[MS.Docs]: FindClose function</a>) (via <a href="https://github.com/python/cpython/blob/master/Modules/posixmodule.c" rel="noreferrer">[GitHub]: python/cpython - (master) cpython/Modules/posixmodule.c</a>)</p></li>
<li><p><code>win32file.FindFilesW</code> uses those (<em>Win</em> specific) functions as well (via <a href="https://github.com/mhammond/pywin32/blob/master/win32/src/win32file.i" rel="noreferrer">[GitHub]: mhammond/pywin32 - (master) pywin32/win32/src/win32file.i</a>)</p></li>
<li><p><em>_get_dir_content</em> (from point <strong>#1.</strong>) can be implemented using any of these approaches (some will require more work and some less)</p>
<ul>
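<li>Some advanced filtering (instead of just file <em>vs.</em> dir) could be done: e.g. the <em>include_folders</em> argument could be replaced by another one (e.g. <em>filter_func</em>), a function that takes a path as an argument and whose default accepts every path (so it doesn't strip out anything), and inside <em>_get_dir_content</em> something like: <code>if not filter_func(entry_with_path): continue</code> (if the function fails for one entry, it will be skipped). However, the more complex the code becomes, the longer it will take to execute</li>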
</ul></li>
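<li><p><strong>Nota bene!</strong> Since recursion is used, I must mention that I did some tests on my laptop (<em>Win 10 x64</em>), totally unrelated to this problem, and when the recursion level was reaching values somewhere in the <em>(990 .. 1000)</em> range (<em>recursionlimit</em>: 1000 by default), I got <em>StackOverflow</em> :). If the directory tree exceeds that limit (I am not an <em>FS</em> expert, so I don't know if that is even possible), that could be a problem.<br/>
I must also mention that I didn't try to increase <em>recursionlimit</em>, because I have no experience in the area (how much can I increase it before having to also increase the stack at <em>OS</em> level), but in theory there will always be the possibility of failure if the dir depth is larger than the highest possible <em>recursionlimit</em> (on that machine)</p></li>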
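<li><p>The code samples are for demonstrative purposes only. That means that I didn't take error handling into account (I don't think there's any <em>try</em> / <em>except</em> / <em>else</em> / <em>finally</em> block), so the code is not robust (the reason is: to keep it as simple and short as possible). For <em>production</em>, error handling should be added as well</p></li>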
</ul>
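<p>Two of the notes above (the <em>filter_func</em> idea and the recursion limit concern) can be sketched together. What follows is an illustration under assumptions, not the code from point <strong>#1.</strong>: the name <em>iter_dir_content</em>, the default <code>lambda</code> that accepts everything, and the choice not to descend into symbolic links are all made up for this example. An explicit stack replaces recursion, so the directory depth is not bounded by <em>recursionlimit</em>:</p>
<pre class="lang-py prettyprint-override"><code>import os
import tempfile


def iter_dir_content(path, filter_func=lambda entry_path: True):
    """Yield the paths below path accepted by filter_func, without recursion."""
    pending = [path]  # Explicit stack of directories still to be listed
    while pending:
        current = pending.pop()
        for entry in os.listdir(current):
            entry_with_path = os.path.join(current, entry)
            is_link = os.path.islink(entry_with_path)
            if os.path.isdir(entry_with_path) and not is_link:
                pending.append(entry_with_path)  # Visit it later
            if not filter_func(entry_with_path):
                continue  # Rejected by the filter: skip this entry
            yield entry_with_path


# Build a tiny throwaway tree so that the sketch is self-contained
demo_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_dir, "dir0", "dir00"))
for parts in (("file0",), ("dir0", "dir00", "file000")):
    open(os.path.join(demo_dir, *parts), "w").close()

only_files = sorted(iter_dir_content(demo_dir, filter_func=os.path.isfile))
print([os.path.relpath(item, demo_dir) for item in only_files])
</code></pre>
<p>Passing <code>os.path.isdir</code> as <em>filter_func</em> would return only the directories, and omitting it returns every entry.</p>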
<h3>Other approaches:</h3>
<ol>
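<li><p>Use <em>Python</em> only as a wrapper</p>
<ul>
<li>Everything is done using another technology</li>
<li>That technology is invoked from <em>Python</em></li>
<li><p>The most famous flavor that I know is what I call the <em>system administrator</em> approach:</p>
<ul>
<li>Use <em>Python</em> (or any programming language, for that matter) to execute <em>shell</em> commands (and parse their outputs)</li>
<li>Some consider this a neat hack</li>
<li>I consider it more like a lame workaround (<em>gainarie</em>), as the action per se is performed from the <em>shell</em> (<em>cmd</em> in this case), and thus doesn't have anything to do with <em>Python</em>.</li>
<li>Filtering (<code>grep</code> / <code>findstr</code>) or output formatting could be done on both sides, but I'm not going to insist on it. Also, I deliberately used <code>os.system</code> instead of <code>subprocess.Popen</code>.</li>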
</ul>
<blockquote>
<pre class="lang-py prettyprint-override"><code>(py35x64_test) E:\Work\Dev\StackOverflow\q003207219>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" -c "import os;os.system(\"dir /b root_dir\")"
dir0
dir1
dir2
dir3
file0
file1
</code></pre>
</blockquote></li>
</ul>
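<p>For completeness, here is a sketch of what parsing the command output on the <em>Python</em> side could look like. It swaps in <code>subprocess.check_output</code> (the example above deliberately uses <code>os.system</code>) only because that call captures the output. The function name <em>shell_list</em> is hypothetical, and the commands assume that <code>ls</code> (on <em>Nix</em>) or <code>cmd</code> (on <em>Win</em>) is available:</p>
<pre class="lang-py prettyprint-override"><code>import os
import subprocess
import tempfile


def shell_list(path):
    """Return the sorted entry names of path, as reported by a shell tool."""
    if os.name == "nt":
        command = ["cmd", "/c", "dir", "/b", path]
    else:
        command = ["ls", "-1", path]
    output = subprocess.check_output(command, universal_newlines=True)
    # Fragile by design: a name containing a newline would be split in two
    return sorted(line for line in output.splitlines() if line)


# Build a tiny throwaway tree so that the sketch is self-contained
demo_dir = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_dir, "dir0"))
open(os.path.join(demo_dir, "file0"), "w").close()

print(shell_list(demo_dir))
</code></pre>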
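<p>In general, this approach is to be avoided, since if some command output format slightly differs between <em>OS</em> versions/flavors, the parsing code should be adapted as well; not to mention differences between locales.</p></li>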
</ol>