Converting Python filenames to unicode

16 votes

6 answers

28026 views

数据工程师

Asked 2025-04-15 12:31

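I am on Python 2.6 on Windows.

I use os.walk to read a file tree. The filenames may contain non-7-bit characters (the German "ae", for example). These are stored in Python's internal string representation.

I process the filenames with Python library functions, and that fails because of the wrong encoding.

How can I convert these filenames into proper (unicode?) Python strings?

I have a file "d:\utest\ü.txt". Passing the path as unicode does not help: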

>>> list(os.walk('d:\\utest'))
[('d:\\utest', [], ['\xfc.txt'])]
>>> list(os.walk(u'd:\\utest'))
[(u'd:\\utest', [], [u'\xfc.txt'])]

Tags: filesystem, string handling, os module, non-ASCII characters, encoding issues, filename encoding, unicode conversion

6 Answers

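A more direct approach is to find out your file system's encoding and then decode the filename from it into unicode. For example, assuming that encoding is UTF-8 (sys.getfilesystemencoding() reports the actual one, and errors="ignore" silently drops any bytes that cannot be decoded),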

unicode_name = unicode(filename, "utf-8", errors="ignore")

If you want to go the other way,

unicode_name.encode("utf-8")
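
The round trip above can be sketched in a form that also runs on Python 3, where the unicode() built-in no longer exists. This is only an illustration: the helper names to_text and to_bytes are hypothetical and not part of any library, and the default encoding is assumed to be whatever sys.getfilesystemencoding() reports.

```python
import sys


def to_text(name, encoding=None):
    """Decode a byte-string filename to text; text passes through unchanged."""
    if isinstance(name, bytes):
        # Assumption: fall back to the filesystem encoding when none is given.
        return name.decode(encoding or sys.getfilesystemencoding())
    return name


def to_bytes(name, encoding=None):
    """Encode a text filename to bytes; bytes pass through unchanged."""
    if isinstance(name, bytes):
        return name
    return name.encode(encoding or sys.getfilesystemencoding())
```

For instance, to_text(b'\xfc.txt', 'latin-1') gives the text name 'ü.txt', and to_bytes of that result with 'utf-8' gives b'\xc3\xbc.txt'. Decoding strictly, rather than with errors="ignore", makes a wrong encoding guess fail loudly instead of silently dropping characters.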

Answered 2025-04-15 by Python大师


I was looking for a solution for Python 3.0 and above. I am putting it here in case someone else needs it.

import os
import sys

rootdir = r'D:\COUNTRY\ROADS'  # a raw string literal cannot end with a backslash
fs_enc = sys.getfilesystemencoding()
for root, dirnames, filenames in os.walk(rootdir.encode(fs_enc)):
    # do your stuff here, but remember that root is now a bytes object
    # and dirnames and filenames are lists of bytes objects
    pass
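
On Python 3.2 and later, the standard library provides os.fsencode() and os.fsdecode(), which convert between str and bytes using the filesystem encoding and its error handler. The following is a minimal sketch of the same walk built on them; the function name walk_as_text is hypothetical and only serves the illustration.

```python
import os


def walk_as_text(rootdir):
    """Walk rootdir as bytes and yield each file's full path as str."""
    # os.fsencode turns the str path into bytes, so os.walk yields bytes.
    for root, dirnames, filenames in os.walk(os.fsencode(rootdir)):
        for name in filenames:
            # os.fsdecode converts the bytes path back to str.
            yield os.fsdecode(os.path.join(root, name))
```

Note that on Python 3 passing a plain str to os.walk already yields str names, so the bytes detour is only needed when the raw byte form of a name matters.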

Answered 2025-04-15 by Python大师


If you pass a Unicode string to os.walk(), you get Unicode results back:

>>> list(os.walk(r'C:\example'))          # Passing an ASCII string
[('C:\\example', [], ['file.txt'])]
>>> 
>>> list(os.walk(ur'C:\example'))        # Passing a Unicode string
[(u'C:\\example', [], [u'file.txt'])]
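
Building on that behaviour, one way to always get text results is to decode the root path before walking. This is a rough sketch under the assumption that the path is encoded in the filesystem encoding; walk_unicode is a hypothetical helper name, not an existing function.

```python
import os
import sys


def walk_unicode(rootdir):
    """Return list(os.walk(rootdir)) with text names for any input type."""
    if isinstance(rootdir, bytes):
        # A byte-string path (a plain str on Python 2) is decoded first,
        # so os.walk receives text and returns text.
        rootdir = rootdir.decode(sys.getfilesystemencoding())
    return list(os.walk(rootdir))
```

With this, walk_unicode('d:\\utest') and walk_unicode(u'd:\\utest') would both report the directory and file names as text strings.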

Answered 2025-04-15 by Python大师

