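Find and replace in filenames
I am running a script that raises an error if a filename contains a '.' or a '+'. So I want to write a script that replaces every '.' with '_'; replacing the '+' works fine. Replacing the '.' is where I run into trouble: if I don't split the filename from its extension, all the files get deleted! I tried splitting it, but now the script appears to run and yet all the '.' are still there!
Here is my script: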
folder = "C:/Documents and Settings/DuffA/Bureaublad/shortcuts projects/klic01/11G008689_1"
import glob, os
for filename in glob.glob(os.path.join(folder, "*+*")):
    os.rename(filename, filename.replace('+','_'))
for root, dirs, filenames in os.walk(folder): # returns root, dirs, and files
    for filename in filenames:
        filename_split = os.path.splitext(filename) # filename and extension name (extension in [1])
        filename_zero = filename_split[0]
        extension = str.upper(filename_split[1])
        for filename_zero in glob.glob(os.path.join(filename_zero, "*.*")):
            os.rename(filename_zero, filename_zero.replace('.','_'))
Thanks in advance!
2 Answers
Why are you using glob inside the os.walk loop, and why do you overwrite the filename_zero variable?
for root, dirs, filenames in os.walk(folder):
    for filename in filenames:
        filename_zero, extension = os.path.splitext(filename) # name and extension
        if "." in filename_zero:
            # os.walk yields bare names, so join them with root before renaming,
            # and put the extension back on the new name
            new_name = filename_zero.replace('.','_') + extension
            os.rename(os.path.join(root, filename), os.path.join(root, new_name))
(untested)
I don't quite follow the logic of your code.
I added some print statements:
folder = "C:/Documents and Settings/DuffA/Bureaublad/shortcuts projects/klic01/11G008689_1"
import glob, os
for filename in glob.glob(os.path.join(folder, "*+*")):
    print("I rename '+' to '_' in\n" + filename)
    os.rename(filename, filename.replace('+','_'))
print('\n\n---- Now, there after, are the filenames in \n ' + folder)
for root, dirs, filenames in os.walk(folder): # returns root, dirs, and files
    for filename in filenames:
        print('\nfilename==', filename)
        filename_split = os.path.splitext(filename) # filename and extension name (extension in [1])
        filename_zero = filename_split[0]
        extension = str.upper(filename_split[1])
        print('filename_zero==', filename_zero)
        print('os.path.join(filename_zero, "*.*")==', os.path.join(filename_zero, "*.*"))
        print('glob.glob(os.path.join(filename_zero, "*.*"))==', glob.glob(os.path.join(filename_zero, "*.*")))
        for filename_zero in glob.glob(os.path.join(filename_zero, "*.*")):
            print(' filename_zero in glob.glob(os.path.join(filename_zero, "*.*")) ==', filename_zero)
            os.rename(filename_zero, filename_zero.replace('.','_'))
This is the result:
I rename '+' to '_' in
C:/Documents and Settings/DuffA/Bureaublad/shortcuts projects/klic01/11G008689_1\+po.rt.hos.txt
I rename '+' to '_' in
C:/Documents and Settings/DuffA/Bureaublad/shortcuts projects/klic01/11G008689_1\ar.am+is.doc
I rename '+' to '_' in
C:/Documents and Settings/DuffA/Bureaublad/shortcuts projects/klic01/11G008689_1\ath+os.html
I rename '+' to '_' in
C:/Documents and Settings/DuffA/Bureaublad/shortcuts projects/klic01/11G008689_1\d'a.rtagn+an
I rename '+' to '_' in
C:/Documents and Settings/DuffA/Bureaublad/shortcuts projects/klic01/11G008689_1\dum+as.doc
I rename '+' to '_' in
C:/Documents and Settings/DuffA/Bureaublad/shortcuts projects/klic01/11G008689_1\ki.kiouili.do+c
---- Now, there after, are the filenames in
C:/Documents and Settings/DuffA/Bureaublad/shortcuts projects/klic01/11G008689_1
filename== ar.am_is.doc
filename_zero== ar.am_is
os.path.join(filename_zero, "*.*")== ar.am_is\*.*
glob.glob(os.path.join(filename_zero, "*.*"))== []
filename== arctic.txt
filename_zero== arctic
os.path.join(filename_zero, "*.*")== arctic\*.*
glob.glob(os.path.join(filename_zero, "*.*"))== []
filename== ath_os.html
filename_zero== ath_os
os.path.join(filename_zero, "*.*")== ath_os\*.*
glob.glob(os.path.join(filename_zero, "*.*"))== []
filename== atla.ntic.html
filename_zero== atla.ntic
os.path.join(filename_zero, "*.*")== atla.ntic\*.*
glob.glob(os.path.join(filename_zero, "*.*"))== []
filename== d'a.rtagn_an
filename_zero== d'a
os.path.join(filename_zero, "*.*")== d'a\*.*
glob.glob(os.path.join(filename_zero, "*.*"))== []
filename== dum_as.doc
filename_zero== dum_as
os.path.join(filename_zero, "*.*")== dum_as\*.*
glob.glob(os.path.join(filename_zero, "*.*"))== []
filename== ki.kiouili.do_c
filename_zero== ki.kiouili
os.path.join(filename_zero, "*.*")== ki.kiouili\*.*
glob.glob(os.path.join(filename_zero, "*.*"))== []
filename== _po.rt.hos.txt
filename_zero== _po.rt.hos
os.path.join(filename_zero, "*.*")== _po.rt.hos\*.*
glob.glob(os.path.join(filename_zero, "*.*"))== []
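glob.glob(os.path.join(filename_zero, "*.*")) always returns [], because os.path.join(filename_zero, "*.*") treats the bare filename (minus its extension) as if it were a directory and looks for entries inside it, relative to the current working directory. No such directory exists, so nothing matches, the inner loop body never runs, and os.rename(filename_zero, filename_zero.replace('.','_')) is never executed.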
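To see why the pattern matches nothing, here is a small sketch in Python 3. The temporary folder and the file name are made up for illustration, and it assumes the current working directory has no subfolder called ar.am_is. It compares the pattern the script builds with one anchored at the folder itself:

```python
import glob
import os
import tempfile

# Hypothetical setup: a temporary folder holding one file with dots in its name.
folder = tempfile.mkdtemp()
open(os.path.join(folder, "ar.am_is.doc"), "w").close()

filename_zero = os.path.splitext("ar.am_is.doc")[0]  # "ar.am_is"

# What the script does: the stem is used as if it were a directory name,
# and the pattern is resolved relative to the current working directory.
wrong_pattern = os.path.join(filename_zero, "*.*")
wrong_matches = glob.glob(wrong_pattern)

# Anchoring the pattern at the folder itself does find the file.
right_pattern = os.path.join(folder, "*.*")
right_matches = glob.glob(right_pattern)

print(wrong_pattern, wrong_matches)
print([os.path.basename(p) for p in right_matches])
```

The first pattern comes back empty, so a loop over it never executes; the second one returns the file because it searches inside the folder that actually contains it.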
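By the way, if you only need the files directly inside folder, I suggest replacing
for root, dirs, filenames in os.walk(folder):
    for filename in filenames:
with
for filename in os.listdir(folder):
    if os.path.isfile(os.path.join(folder, filename)):
or, better still (one less level of indentation),
for filename in (f for f in os.listdir(folder) if os.path.isfile(os.path.join(folder, f))):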
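One thing to watch with os.listdir: it returns bare names, so os.path.isfile needs the folder joined back on, otherwise the check runs against the current working directory. A minimal sketch in Python 3, with a made-up temporary folder and assuming the working directory has no file called dum+as.doc:

```python
import os
import tempfile

# Hypothetical setup: a folder with one file and one subfolder.
folder = tempfile.mkdtemp()
open(os.path.join(folder, "dum+as.doc"), "w").close()
os.mkdir(os.path.join(folder, "subdir"))

# Bare names are checked against the current working directory,
# so the file inside `folder` is not recognised.
bare = [f for f in os.listdir(folder) if os.path.isfile(f)]

# Joining each name with the folder checks the entries that were listed.
joined = [f for f in os.listdir(folder) if os.path.isfile(os.path.join(folder, f))]

print(bare)
print(joined)
```

Only the second list contains the file, and the subfolder is filtered out in both.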
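I think the path you are on is a dead end. If I understand correctly, you actually want to replace the dots and plus signs in the part of the filename before the extension; that is, you don't want to replace the dot that separates the name from the extension, nor a plus sign inside the extension. In any case, dots and plus signs in an extension make no sense.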
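For reference, this is how os.path.splitext divides some of the names from this thread (a quick Python 3 check; the dictionary name parts is only for illustration). Only the last dot counts as the extension separator:

```python
import os

names = ["ar.am+is.doc", "+po.rt.hos.txt", "d'a.rtagn+an", "arctic.txt"]

# Map each name to its (stem, extension) pair as os.path.splitext sees it.
parts = {name: os.path.splitext(name) for name in names}

for name, (stem, extension) in parts.items():
    print(repr(name), "->", repr(stem), repr(extension))
```

Note that for d'a.rtagn+an everything after the single dot, .rtagn+an, is treated as the extension, so a replacement limited to the stem leaves that name untouched.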
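So you tried to use glob. But personally, since '.' is what separates the extension, I don't see how glob could really be used to achieve this.
So I think you should take a different approach.
Rather than having glob check every filename against a wildcard pattern and return only the names that need processing, we can simply iterate over the list of filenames and try to replace the '+' and '.' before the extension. Yes, some filenames will have no dot or plus sign there, and the program does needless work on them. But glob does the same work behind the scenes anyway. So, work for work, I prefer to write code I can picture myself, that is, without glob.
The following code seems to me a short and effective solution: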
folder = "C:/Documents and Settings/DuffA/Bureaublad/shortcuts projects/klic01/11G008689_1"
import os
separ = os.sep
for n in os.listdir(folder):
    print(n)
    if os.path.isfile(folder + separ + n):
        # replace only in the part before the extension, then put the extension back
        filename_zero, extension = os.path.splitext(n)
        os.rename(folder + separ + n , folder + separ + filename_zero.replace('.','_').replace('+','_') + extension)
print('\n--------------------------------\n')
for n in os.listdir(folder):
    print(n)
Result:
+po.rt.hos.txt
ar.am+is.doc
arctic.txt
ath+os.html
atla.ntic.html
d'a.rtagn+an
dum+as.doc
ki.kiouili.do+c
--------------------------------
arctic.txt
ar_am_is.doc
ath_os.html
atla_ntic.html
d'a.rtagn+an
dum_as.doc
ki_kiouili.do+c
_po_rt_hos.txt
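If you are on Python 3, the same idea can be sketched with pathlib. This is only a sketch under a few assumptions: the function name rename_in_folder and the temporary test folder are made up for illustration, it handles only the files directly inside the folder, and it skips a rename when the target name already exists rather than overwriting it.

```python
import tempfile
from pathlib import Path

def rename_in_folder(folder):
    """Replace '.' and '+' with '_' in the stem of every file directly in folder."""
    renamed = []
    # sorted() takes a snapshot of the listing, so renames do not disturb the loop.
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        new_stem = path.stem.replace(".", "_").replace("+", "_")
        target = path.with_name(new_stem + path.suffix)
        if target == path:
            continue  # nothing to replace in this name
        if target.exists():
            print("skipping", path.name, "because", target.name, "already exists")
            continue
        path.rename(target)
        renamed.append((path.name, target.name))
    return renamed

# Hypothetical test data in a temporary folder.
folder = tempfile.mkdtemp()
for name in ["ar.am+is.doc", "arctic.txt", "dum+as.doc"]:
    Path(folder, name).touch()

result = rename_in_folder(folder)
final_names = sorted(p.name for p in Path(folder).iterdir())
print(result)
print(final_names)
```

Returning the list of (old, new) pairs makes it easy to log what was changed, and the existence check guards against two different names collapsing into the same target.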