I have the following data from a CFD simulation:
Average value for X = 0.5080000265E-0003 to 0.2489200234E-0001
Z = -.3141592741E+0001
Time = 0.7000032425E+0001
Y P_g
0.1511904760E-0002 0.2565604063E+0006
0.4535714164E-0002 0.2565349844E+0006
0.7559523918E-0002 0.2565098906E+0006
0.1058333274E-0001 0.2564848125E+0006
0.1360714249E-0001 0.2564597656E+0006
0.1663095318E-0001 0.2564346563E+0006
0.1965476200E-0001 0.2564095625E+0006
... ...
... ...
0.1259419441E+0001 0.2549983125E+0006
0.1262443304E+0001 0.2549983125E+0006
0.1265467167E+0001 0.2549983125E+0006
0.1268491030E+0001 0.2549982656E+0006
Time = 0.7010014057E+0001
Y P_g
0.1511904760E-0002 0.2565604063E+0006
0.4535714164E-0002 0.2565349844E+0006
0.7559523918E-0002 0.2565098906E+0006
0.1058333274E-0001 0.2564848125E+0006
... ...
... ...
0.1259419441E+0001 0.2549983125E+0006
0.1262443304E+0001 0.2549983125E+0006
0.1265467167E+0001 0.2549983125E+0006
0.1268491030E+0001 0.2549982656E+0006
Time = 0.7020006657E+0001
Y P_g
0.1511904760E-0002 0.2565604063E+0006
0.1058333274E-0001 0.2564848125E+0006
... ...
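As the example above shows, the data is split into vertical sections by the Time time-step headers. Within each section, Y does not change, but P_g does. To plot the data, I need the P_g values of each section listed side by side, one column per section. For example, I need to recreate the data like this:
Using Pandas, I can read the data from the text file and create a new data frame with the Y values as the index (rows) and the Time values as the columns: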
import pandas as pd
# Read in data from text file
# -------------------------------------------------------------------------
# data frame from text file contents, skip first 4 rows, separate by variable
# white space, no header (r'\s+' instead of '\s*', which can match nothing)
df = pd.read_table('ROP_s_SD.dat', skiprows=4, sep=r'\s+', header=None,
                   engine='python')
# Time data
# -------------------------------------------------------------------------
# data frame of the rows that contain the Time string
# (.ix was removed from pandas; .iloc selects by position instead)
dftime = df.loc[df.iloc[:, 0].astype(str).str.contains('Time')]
t = dftime[2].tolist() # time list
idx = dftime.index # index of rows containing Time string
# Y data
# -------------------------------------------------------------------------
# grab values for y to create index for new data frame
ido = idx[0]+2 # index of first y value
idf = idx[1] # index of the next Time row (exclusive end of the y values)
y = [] # empty list to store y values
for i in range(ido, idf): # iterate through first section of y values
    v = df.iloc[i, 0] # get y value from data frame
    y.append(float(v)) # add y value to y list
# New data frame
# ------------------------------------------------------------------------
# empty data frame with y as index and t as columns
dfnew = pd.DataFrame(None, index=y, columns=t)
print('dfnew is \n', dfnew.head())
The head of the empty data frame, dfnew.head(), looks like this:
7.000032 7.010014 7.020007 7.030043 7.040020 7.050035 7.060043
0.001512 NaN NaN NaN NaN NaN NaN NaN
0.004536 NaN NaN NaN NaN NaN NaN NaN
0.007560 NaN NaN NaN NaN NaN NaN NaN
0.010583 NaN NaN NaN NaN NaN NaN NaN
0.013607 NaN NaN NaN NaN NaN NaN NaN
7.070004 7.080036 7.090022 ... 7.650011 7.660032 7.670026
0.001512 NaN NaN NaN ... NaN NaN NaN
0.004536 NaN NaN NaN ... NaN NaN NaN
0.007560 NaN NaN NaN ... NaN NaN NaN
0.010583 NaN NaN NaN ... NaN NaN NaN
0.013607 NaN NaN NaN ... NaN NaN NaN
7.680044 7.690029 7.700008 7.710012 7.720014 7.730019 7.740026
0.001512 NaN NaN NaN NaN NaN NaN NaN
0.004536 NaN NaN NaN NaN NaN NaN NaN
0.007560 NaN NaN NaN NaN NaN NaN NaN
0.010583 NaN NaN NaN NaN NaN NaN NaN
0.013607 NaN NaN NaN NaN NaN NaN NaN
[5 rows x 75 columns]
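The NaN entries in each column should hold the P_g values from that particular Time section. How can I add the P_g values from each section to their respective columns?
The text file I am reading can be downloaded here.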
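Two things. First, consider how you would reduce this to a two-dimensional spreadsheet. What columns should each row have? I suggest that each row contain {Time, Y, P_g}. That may also suggest a strategy for handling the odd input format.
Second, are you trying to plot P_g vs. Time for a given Y value? Your data appears to have 3 variables, and you need to reduce it to 2 dimensions for a 2-D plot. Do you want to plot the average of P_g at a specific Time value? Or do you want a 3-D plot, where Y vs. P_g is drawn for each Time value? Assuming you adopt the row/column structure I suggested above, any of these is easy to do in pandas. Take a look at the pandas groupby functionality. Here's more detail on that.
Edit: you have clarified both of my questions. Try this: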
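One way to build the {Time, Y, P_g} table suggested above is sketched here; it is an illustration, not the answerer's original snippet. The `read_long` helper and the inline `sample` text are hypothetical stand-ins for the real file, and the parser assumes every section starts with a `Time = ...` line followed by two-field data rows.

```python
import io
import pandas as pd

# Hypothetical miniature of the file layout shown in the question.
sample = """\
Average value for X = 0.5080000265E-0003 to 0.2489200234E-0001
Z = -.3141592741E+0001
Time = 0.7000032425E+0001
Y P_g
0.1511904760E-0002 0.2565604063E+0006
0.4535714164E-0002 0.2565349844E+0006
Time = 0.7010014057E+0001
Y P_g
0.1511904760E-0002 0.2565604063E+0006
0.4535714164E-0002 0.2565349844E+0006
"""

def read_long(handle):
    """Return one row per (Time, Y, P_g) triple found in the file."""
    rows = []
    time = None
    for line in handle:
        parts = line.split()
        if len(parts) == 3 and parts[0] == 'Time':
            time = float(parts[2])   # a Time header starts a new section
        elif len(parts) == 2 and time is not None:
            try:
                rows.append((time, float(parts[0]), float(parts[1])))
            except ValueError:
                pass                 # skips the "Y P_g" column header
    return pd.DataFrame(rows, columns=['Time', 'Y', 'P_g'])

long_df = read_long(io.StringIO(sample))

# Wide layout asked for in the question: Y as index, one column per Time.
wide = long_df.pivot(index='Y', columns='Time', values='P_g')

# Example of a groupby reduction: mean P_g at each time step.
mean_by_time = long_df.groupby('Time')['P_g'].mean()

print(wide)
print(mean_by_time)
```

For the real file, `read_long(open('ROP_s_SD.dat'))` would take the place of the `io.StringIO` call. Keeping the long table around makes both the pivot and any groupby summary a single call.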
It looks like you have already done most of the hard work... The following lines will finish filling in the data frame:
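A minimal sketch of that filling step, under stated assumptions: the small `df` below is a hypothetical stand-in for the raw frame read in the question (column 0 holds the `Time` marker or a Y value, column 1 holds P_g, column 2 holds the time value on `Time` rows), and every section is assumed to have the same number of rows.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the raw frame `df` from the question.
df = pd.DataFrame([
    ['Time', '=', '7.0'],
    ['Y', 'P_g', None],
    ['0.001', '256560.4', None],
    ['0.004', '256534.9', None],
    ['Time', '=', '7.01'],
    ['Y', 'P_g', None],
    ['0.001', '256560.4', None],
    ['0.004', '256534.9', None],
])

# Positions of the rows that start a section.
idx = np.flatnonzero((df[0] == 'Time').to_numpy())
t = df.iloc[idx, 2].astype(float).tolist()   # one time value per section
n = idx[1] - idx[0] - 2                      # data rows per section (assumed equal)
y = df.iloc[idx[0] + 2: idx[0] + 2 + n, 0].astype(float).tolist()

dfnew = pd.DataFrame(np.nan, index=y, columns=t)
for start, time in zip(idx, t):
    # P_g values sit in column 1, starting two rows below each Time header.
    block = df.iloc[start + 2: start + 2 + n, 1].astype(float)
    dfnew[time] = block.to_numpy()

print(dfnew)
```

Assigning `block.to_numpy()` rather than the Series itself avoids index alignment, since the raw row labels do not match the Y index of `dfnew`.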
For the first 3 columns, the head of dfnew now looks like this:
You have a lot of elements, so the best way to view the data is probably in 2-D:
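One possible 2-D view is a color map with Time on the horizontal axis and Y on the vertical axis. This is a sketch that assumes matplotlib is available; the `dfnew` built here is filled with made-up numbers purely so the example runs, and the output file name is arbitrary.

```python
import matplotlib
matplotlib.use('Agg')  # off-screen backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical filled frame standing in for dfnew: Y index, Time columns.
y = np.linspace(0.0015, 1.2685, 50)
t = np.linspace(7.0, 7.74, 75)
values = 2.55e5 + 1.5e3 * np.exp(-5 * y)[:, None] * np.ones(t.size)  # made-up P_g
dfnew = pd.DataFrame(values, index=y, columns=t)

fig, ax = plt.subplots()
# One colored cell per (Y, Time) pair; the color encodes P_g.
mesh = ax.pcolormesh(dfnew.columns.to_numpy(dtype=float),
                     dfnew.index.to_numpy(dtype=float),
                     dfnew.to_numpy(),
                     shading='auto')
ax.set_xlabel('Time')
ax.set_ylabel('Y')
fig.colorbar(mesh, ax=ax, label='P_g')
fig.savefig('p_g_map.png')
```

With the real data, only the three lines that construct the stand-in `dfnew` need to go.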
By the way, in your example file the P_g values look the same at every time step...