Why is shutil.rmtree() so slow?

6 votes

3 answers

10057 views

数据工程师

Asked 2025-04-16 14:38

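I went looking for how to delete a directory in Python and ended up using shutil.rmtree(). Its speed surprised me: it was noticeably slower than I would expect from rm --recursive. Is there a faster alternative, short of using the subprocess module?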

performance optimization, file system, file operations, subprocess module, programming efficiency, shutil module, directory deletion

3 Answers

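I don't know exactly what is going wrong, but you could try another approach: delete all the files first, then remove the directories that contained them.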

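import os

# Walk bottom-up so every directory is already empty when it is removed.
for root, dirs, files in os.walk("path", topdown=False):
    for name in files:
        os.remove(os.path.join(root, name))
    os.rmdir(root)  # note: symlinks to directories are not handled here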

Answered 2025-04-16 by Python大师


If you care about speed:

import os, shlex
os.system("rm -fr -- " + shlex.quote(your_dirname))  # quote the name so the shell cannot misparse it

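Beyond that, I have not found shutil.rmtree() to be much slower. There is, of course, some extra overhead at the Python level. And I will only believe a claim like this if you back it up with reasonable data.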
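To produce that kind of data, here is a rough timing sketch. It assumes a Unix-like system with rm on the PATH; make_tree, time_removal and rm_rf are hypothetical helper names, and the tree size is arbitrary. It calls rm through subprocess only for measurement.

```python
import os
import shutil
import subprocess
import tempfile
import time


def make_tree(root, n_dirs=20, n_files=50):
    """Create n_dirs subdirectories holding n_files empty files each (sizes are arbitrary)."""
    for d in range(n_dirs):
        sub = os.path.join(root, "d%d" % d)
        os.mkdir(sub)
        for f in range(n_files):
            open(os.path.join(sub, "f%d" % f), "w").close()


def time_removal(remove, **tree_kwargs):
    """Build a fresh tree, time remove(path), and return the elapsed seconds."""
    root = tempfile.mkdtemp()
    make_tree(root, **tree_kwargs)
    start = time.perf_counter()
    remove(root)
    return time.perf_counter() - start


def rm_rf(path):
    # Passing a list avoids shell quoting problems; "--" ends option parsing.
    subprocess.run(["rm", "-rf", "--", path], check=True)


if __name__ == "__main__":
    print("shutil.rmtree: %.4f s" % time_removal(shutil.rmtree))
    print("rm -rf:        %.4f s" % time_removal(rm_rf))
```

The numbers depend heavily on the filesystem, the cache state and the shape of the tree, so it is worth running this several times on a tree that resembles the real one.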

Answered 2025-04-16 by Python大师


The implementation does a lot of extra work:

def rmtree(path, ignore_errors=False, onerror=None):
    """Recursively delete a directory tree.

    If ignore_errors is set, errors are ignored; otherwise, if onerror
    is set, it is called to handle the error with arguments (func,
    path, exc_info) where func is os.listdir, os.remove, or os.rmdir;
    path is the argument to that function that caused it to fail; and
    exc_info is a tuple returned by sys.exc_info(). If ignore_errors
    is false and onerror is None, an exception is raised.

    """
    if ignore_errors:
        def onerror(*args):
            pass
    elif onerror is None:
        def onerror(*args):
            raise
    try:
        if os.path.islink(path):
            # symlinks to directories are forbidden, see bug #1669
            raise OSError("Cannot call rmtree on a symbolic link")
    except OSError:
        onerror(os.path.islink, path, sys.exc_info())
        # can't continue even if onerror hook returns
        return
    names = []
    try:
        names = os.listdir(path)
    except os.error:
        onerror(os.listdir, path, sys.exc_info())
    for name in names:
        fullname = os.path.join(path, name)
        try:
            mode = os.lstat(fullname).st_mode
        except os.error:
            mode = 0
        if stat.S_ISDIR(mode):
            rmtree(fullname, ignore_errors, onerror)
        else:
            try:
                os.remove(fullname)
            except os.error:
                onerror(os.remove, fullname, sys.exc_info())
    try:
        os.rmdir(path)
    except os.error:
        onerror(os.rmdir, path, sys.exc_info())

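Note the os.path.join() call used to build each new filename; string operations take time. The rm(1) implementation instead uses the unlinkat(2) system call, which needs no extra string operations. (In fact, it spares the kernel from walking the whole path through namei() just to find the common directory, over and over again. The kernel's dentry cache is good and useful, but that can still add up to a fair amount of in-kernel string manipulation and comparison.) The rm(1) utility bypasses all that string manipulation and simply uses a file descriptor for the directory.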
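Python exposes the same idea through the dir_fd parameter of os.open, os.unlink and os.rmdir (Python 3.3+, on platforms that support it), and os.scandir accepts a directory file descriptor from Python 3.7. Below is a minimal sketch assuming a POSIX system; rmtree_fd is a hypothetical helper name, not part of shutil, and it leaves out the error handling that rmtree provides. As far as I know, recent Python 3 versions of shutil.rmtree take a similar fd-based route where the platform allows it, so this is meant to illustrate the technique rather than to replace the library.

```python
import os


def rmtree_fd(path, dir_fd=None):
    """Delete the tree at path using directory file descriptors (hypothetical helper)."""
    # Open the directory itself; O_NOFOLLOW refuses a symlink as the last component.
    fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW, dir_fd=dir_fd)
    try:
        with os.scandir(fd) as entries:
            # Read the whole listing first so nothing is deleted while iterating.
            items = [(e.name, e.is_dir(follow_symlinks=False)) for e in entries]
        for name, is_dir in items:
            if is_dir:
                # Recurse with a bare name relative to fd: no path joining needed.
                rmtree_fd(name, dir_fd=fd)
            else:
                # Equivalent of unlinkat(fd, name, 0).
                os.unlink(name, dir_fd=fd)
    finally:
        os.close(fd)
    # The directory is empty now; remove it relative to its parent.
    os.rmdir(path, dir_fd=dir_fd)
```

Calling rmtree_fd("some/dir") removes the directory and everything under it. One descriptor stays open per nesting level, so extremely deep trees could run into the process's open-file limit.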

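In addition, both rm(1) and rmtree() check the st_mode of every file and directory in the tree; but the C implementation does not need to turn every struct stat into a Python object just to perform a simple integer mask operation. I don't know how long that takes, but it happens once for every file, directory, pipe, symlink, and so on in the tree.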
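One way to cut down on that per-entry cost from Python is os.scandir, whose DirEntry.is_dir() can usually answer from the type information returned with the directory listing, without a separate lstat() call or stat_result object (it falls back to a stat call when the filesystem does not report the type). The sketch below compares the two approaches; both function names are hypothetical.

```python
import os
import stat


def classify_with_lstat(path):
    """listdir approach: one lstat() call and one stat_result object per entry."""
    dirs, others = [], []
    for name in os.listdir(path):
        mode = os.lstat(os.path.join(path, name)).st_mode
        (dirs if stat.S_ISDIR(mode) else others).append(name)
    return sorted(dirs), sorted(others)


def classify_with_scandir(path):
    """scandir approach: the entry type usually comes with the directory listing."""
    dirs, others = [], []
    with os.scandir(path) as entries:
        for entry in entries:
            is_dir = entry.is_dir(follow_symlinks=False)
            (dirs if is_dir else others).append(entry.name)
    return sorted(dirs), sorted(others)
```

Both functions return the same (directories, other entries) pair for a given directory; only the number of system calls and temporary objects differs.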

Answered 2025-04-16 by Python大师
