I am using this script to remove duplicate lines and values greater than 10000.
import fileinput
import os, sys
import re

def get_immediate_subdirectories(a_dir):
    return [name for name in os.listdir(a_dir) if os.path.isdir(os.path.join(a_dir,name))]

for i in get_immediate_subdirectories(os.getcwd()+'\population_vs_time\\'):
    for j in get_immediate_subdirectories(os.getcwd()+'\population_vs_time\\'+i):
        for file in os.listdir(os.getcwd()+'\population_vs_time\\'+i+'\\'+j):
            seen=set()
            my_dir=os.getcwd()+'\population_vs_time\\'+str(i)+'\\'+str(j)+'\\'+file
            matches=re.match("time",str(file))
            if not matches:
                print (my_dir)
                f = fileinput.input(files=my_dir)
                for line in f:
                    if line in seen: continue # skip duplicate
                    flag=0
                    words = line.split()
                    for word in words:
                        # try:
                        i=float(word)
                        if i>10000:
                            flag=1
                            break
                        # except ValueError:
                        # flag=1
                    if flag==1: continue
                    seen.add(line)
                    print line, # standard output is now redirected to the file
                f.close()
I have a variable my_dir of type string, and I use print to display its value. For the first three files it gives the correct output:
C:\Program Files (x86)\Guimoo\bin\population_vs_time\mocmaes\frontsize_gen=220\mocmaes_gen_220_100_.csv
C:\Program Files (x86)\Guimoo\bin\population_vs_time\mocmaes\frontsize_gen=220\mocmaes_gen_220_120_.csv
C:\Program Files (x86)\Guimoo\bin\population_vs_time\mocmaes\frontsize_gen=220\mocmaes_gen_220_140_.csv
But when the next file is read, the output contains what looks like a random number (1.70512 instead of mocmaes):
C:\Program Files (x86)\Guimoo\bin\population_vs_time\1.70512\frontsize_gen=220\mocmaes_gen_220_160_.csv
I think I am missing some basic knowledge about Python escape characters. Is that the case?
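
Escape characters are probably not the cause here. Judging from the script as posted, a likely explanation is that `i=float(word)` in the innermost loop rebinds the same name `i` that the outermost `for` loop uses for the directory name, so the next time `my_dir` is built, `str(i)` yields the last number parsed from the previous file. The following is a minimal sketch of that effect and of one way to avoid it. The function names `build_paths_buggy` and `build_paths_fixed` and the sample data are hypothetical and only serve as illustration.

```python
def build_paths_buggy(dirs, files, words):
    """Reuses `i` for both the directory name and the parsed number."""
    paths = []
    for i in dirs:
        for name in files:
            # str(i) is the directory name only until `i` gets rebound below
            paths.append("population_vs_time\\" + str(i) + "\\" + name)
            for word in words:
                i = float(word)  # overwrites the outer loop variable
    return paths


def build_paths_fixed(dirs, files, words):
    """Same loops, but the parsed number has its own name."""
    paths = []
    for subdir in dirs:
        for name in files:
            paths.append("population_vs_time\\" + subdir + "\\" + name)
            for word in words:
                value = float(word)  # `subdir` is left untouched
    return paths


if __name__ == "__main__":
    sample_files = ["a.csv", "b.csv"]   # hypothetical file names
    sample_words = ["1.70512"]          # hypothetical file content
    print(build_paths_buggy(["mocmaes"], sample_files, sample_words))
    print(build_paths_fixed(["mocmaes"], sample_files, sample_words))
```

In the buggy variant the second path contains `1.70512` in place of `mocmaes`, which matches the symptom described above. Renaming the inner variable (for example to `value`) should be enough to fix it. As a side note, the `'\population_vs_time\\'` literals happen to work because `\p` is not a recognized escape sequence, so the backslash is kept. Building paths with `os.path.join` or raw strings would be more robust.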