Reducing the write time

Posted 2024-04-20 09:04:54


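I am post-processing CFD data (applying a rotation to the coordinates). To do this, I:

- read the file
- store the data in a structured array
- manipulate the data (do the calculations)
- write a new file

It works, but each file takes 7 seconds, and I still have (15000*4) files to process.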

for i in range(len(file_count)):
    #Source folder with the original files
    os.chdir(os.path.join(path, folder_source_location))
    #Generate the file name (zero-padded counter followed by "_tec.dat")
    file_name = file_source_begin+("%0"+str(ndigit)+"d") % file_count[i]+"_tec.dat"

    #Read the file
    Data = read_tecUNS(file_name)

    #Data_new is an alias of Data (no copy is made), so the edits below modify Data as well
    Data_new = Data

    #Translation
    Data["node"]["X"] += translator_plane2RotCenter[0]    #The += is important: it modifies the array in place
    Data["node"]["Y"] += translator_plane2RotCenter[1]
    Data["node"]["Z"] += translator_plane2RotCenter[2]

    #Rotation (both temporaries are computed before Y and Z are overwritten)
    Y_temp = Data["node"]["Y"]*cos(theta_rot_rad)-Data["node"]["Z"]*sin(theta_rot_rad)
    Z_temp = Data["node"]["Y"]*sin(theta_rot_rad)+Data["node"]["Z"]*cos(theta_rot_rad)

    Data_new["node"]["Y"] = Y_temp
    Data_new["node"]["Z"] = np.mean(Z_temp)   #Due to rounding, the Z values are not exactly the same. Taking the mean avoids that.

    #Write the new file
    os.chdir(os.path.join(path, folder_source_location, "Output"))
    write_tecplot(file_name, Data_new)
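The translation and rotation step can also be written as a small function that works on an explicit copy, which avoids relying on in-place aliasing. This is only a sketch: the function name translate_and_rotate and the three-field dtype are assumptions for illustration, and the real arrays returned by read_tecUNS may carry more fields. It also leaves out the averaging of Z used in the script above.

```python
import numpy as np

def translate_and_rotate(nodes, shift, theta):
    """Return a transformed copy of a structured array with X, Y, Z fields.

    The points are first shifted by `shift`, then rotated by `theta`
    radians about the X axis.
    """
    out = nodes.copy()  # explicit copy: the input array stays untouched
    x = nodes["X"] + shift[0]
    y = nodes["Y"] + shift[1]
    z = nodes["Z"] + shift[2]
    out["X"] = x
    out["Y"] = y * np.cos(theta) - z * np.sin(theta)
    out["Z"] = y * np.sin(theta) + z * np.cos(theta)
    return out

# Hypothetical three-node data set, used only for illustration
nodes = np.zeros(3, dtype=[("X", "f8"), ("Y", "f8"), ("Z", "f8")])
nodes["Y"] = [1.0, 2.0, 3.0]
rotated = translate_and_rotate(nodes, shift=(0.0, 0.0, 0.0), theta=np.pi / 2)
```

Because the function returns a copy, the original node data remains available for comparison after the call.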

Do you have any ideas for improving this? I considered threading the writes, but I am not sure it would help.

Here is an example of the read/compute/write timings:

The output folder already exists. The data in it will be erased
StartReading B--0.000018_tec.dat in progress. - 0.001s elapsed
EndReading B--0.000018_tec.dat in progress. - 0.433s elapsed
StartWriting B--0.000018_tec.dat in progress. - 0.435s elapsed
EndWriting B--0.000018_tec.dat in progress. - 7.585s elapsed

StartReading B--0.000036_tec.dat in progress. - 7.586s elapsed
EndReading B--0.000036_tec.dat in progress. - 7.697s elapsed
StartWriting B--0.000036_tec.dat in progress. - 7.697s elapsed
EndWriting B--0.000036_tec.dat in progress. - 13.472s elapsed

Here are the script and a sample file, in case you want to try it yourself:

http://s000.tinyupload.com/index.php?file_id=80589646527340633700


1 answer

Answer 1 · posted 2024-04-20 09:04:54

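The problem is not the writing itself, but how the data is prepared and formatted for writing.

If you profile the script with something like python -m cProfile -s cumtime Plane_modifier_rev4-multiple_files.py > out.txt, you will see that most of the time is spent formatting arrays: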

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.003    0.003   22.297   22.297 Plane_modifier_rev4-multiple_files.py:6(<module>)
        2    0.282    0.141   21.881   10.941 ASCII_TEC.py:101(write_tecplot)
77424/48512    0.091    0.000   21.527    0.000 numeric.py:1681(array_str)
77424/48512    0.424    0.000   21.477    0.000 arrayprint.py:343(array2string)
    48512    0.928    0.000   21.149    0.000 arrayprint.py:233(_array2string)
   145536    0.360    0.000   12.532    0.000 arrayprint.py:533(__init__)
   145536    5.891    0.000   12.172    0.000 arrayprint.py:547(fillFormat)
    48512    0.219    0.000    7.922    0.000 arrayprint.py:700(__init__)
    48512    0.620    0.000    5.623    0.000 arrayprint.py:465(_formatArray)
   170236    2.416    0.000    4.413    0.000 arrayprint.py:598(__call__)
   631546    1.300    0.000    2.933    0.000 numeric.py:2428(seterr)
   434430    2.310    0.000    2.310    0.000 {method 'reduce' of 'numpy.ufunc' objects}
   315773    0.337    0.000    1.941    0.000 numeric.py:2813(__enter__)
   143356    0.234    0.000    1.814    0.000 fromnumeric.py:1772(any)
   315773    0.359    0.000    1.689    0.000 numeric.py:2818(__exit__)
    48512    0.473    0.000    1.268    0.000 arrayprint.py:639(__init__)
   143356    0.157    0.000    1.163    0.000 {method 'any' of 'numpy.ndarray' objects}
   631546    0.967    0.000    1.034    0.000 numeric.py:2524(geterr)
   143356    0.092    0.000    1.006    0.000 _methods.py:37(_any)
   443944    0.763    0.000    0.944    0.000 arrayprint.py:632(_digits)
   143358    0.166    0.000    0.418    0.000 numeric.py:464(asanyarray)
   145536    0.410    0.000    0.410    0.000 {method 'compress' of 'numpy.ndarray' objects}
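A similar report can be produced from inside Python with the standard cProfile and pstats modules, which is handy for profiling a single function instead of the whole script. The sketch below is illustrative only: format_rows is a hypothetical stand-in for the slow part of write_tecplot, not the code from ASCII_TEC.py.

```python
import cProfile
import io
import pstats

import numpy as np

def format_rows(values, per_line=5):
    """Hypothetical stand-in for the slow writer: formats slices with str(ndarray)."""
    return [str(values[i:i + per_line])[1:-1]
            for i in range(0, len(values), per_line)]

def profile_call(func, *args, top=10):
    """Run func under cProfile and return a report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumtime").print_stats(top)  # keep only the top entries
    return buffer.getvalue()

report = profile_call(format_rows, np.random.rand(5000))
print(report)
```

Sorting by cumtime puts the callers that dominate the run time at the top, which is how write_tecplot and the arrayprint functions stand out in the listing above.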

For example, this:

    for name in names:
        for col_index in range(0,N,5):  #The tecplot data for each variable are saved within 5 columns
            f.write(str(Data["node"][name][col_index:col_index+5])[1:-1]+"\n")
        f.write("\n"+"\n")

can be rewritten like this (and should be faster):

    for name in names:
        n = Data["node"][name]
        for col_index in range(0,N,5):  #The tecplot data for each variable are saved within 5 columns
            vs = n[col_index:col_index+5]
            f.write(",".join([str(v) for v in vs])+"\n")
        f.write("\n"+"\n")
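To check the difference on your own machine, the two variants above can be timed side by side with timeit. This is a rough sketch on random data, with hypothetical helper names; the actual speed-up depends on the NumPy version and the array sizes, so no timings are quoted here.

```python
import timeit

import numpy as np

values = np.random.rand(2000)  # illustrative stand-in for one node variable

def rows_via_array_str(values, per_line=5):
    """Original approach: let NumPy format each 5-value slice as a string."""
    return [str(values[i:i + per_line])[1:-1]
            for i in range(0, len(values), per_line)]

def rows_via_join(values, per_line=5):
    """Rewritten approach: format each scalar and join with commas."""
    return [",".join([str(v) for v in values[i:i + per_line]])
            for i in range(0, len(values), per_line)]

slow = timeit.timeit(lambda: rows_via_array_str(values), number=5)
fast = timeit.timeit(lambda: rows_via_join(values), number=5)
print(f"str(ndarray): {slow:.3f}s   join: {fast:.3f}s")
```

Note that the two variants do not produce identical text: str() on an array slice separates values with spaces and rounds to the print precision (8 digits by default), while the join version separates values with commas and keeps the full precision of each scalar.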

Edit

Some changes to write_tecplot:

def write_tecplot(outfile,Data):
    """
    The expected Data is a dictionary with one structured array: node and one simple array: face
    """
    N = Data["node"].shape[0]   #N is the number of nodes
    E = Data["face"].shape[0]  #E is the number of faces

    #Create the file and the main names
    with open(outfile+'.dat', 'w') as f:
        """ Write HEADER """
        f.write('TITLE = \"title\"\n')
        f.write('VARIABLES  = ')
        #initialize
        names = Data["node"].dtype.names

        #write variable names
        f.write(u'"'+'\",\"'.join(names)+'"\n')
        f.write('ZONE T="tecdata", N=%s, E=%s, ET=QUADRILATERAL, F=FEBLOCK\n\n'%(N,E))

#        Data_number =  len(Data["node"])     #Data_number is the 

        """ WRITE DATA """
        #Write node data
        for name in names:
            n = Data["node"][name]
            for col_index in range(0,N,5):  #The tecplot data for each variable are saved within 5 columns
                f.write(",".join([str(v) for v in n[col_index:col_index+5]])+"\n")
            f.write("\n\n")


        #Write face data, one face per line
        face = Data["face"]
        for row_index in range(E):
            f.write(",".join([str(v) for v in face[row_index]])+"\n")
        f.write("\n\n")
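If the per-line writes are still too slow, one further option (not part of the answer above, so treat it as an untested assumption for this data) is to convert each variable to plain Python floats once with tolist() and build the whole block as a single string before writing. The helper name format_block is hypothetical and it expects a one-dimensional array.

```python
import io

import numpy as np

def format_block(values, per_line=5, sep=","):
    """Format a 1-D array as text with `per_line` values on each line."""
    # tolist() converts to native Python floats in one call, which avoids
    # formatting NumPy scalars one at a time
    items = [repr(v) for v in np.asarray(values).tolist()]
    lines = [sep.join(items[i:i + per_line])
             for i in range(0, len(items), per_line)]
    return "\n".join(lines) + "\n"

# io.StringIO stands in for the real output file in this sketch
buffer = io.StringIO()
buffer.write(format_block(np.arange(12, dtype=float)))
print(buffer.getvalue())
```

Inside write_tecplot, the inner loop over col_index could then be replaced by a single f.write(format_block(n)) per variable, assuming the resulting layout is acceptable to Tecplot.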
