How to extract data from a file using only Python

# GROMACS
#
@ title "GROMACS Energies"
@ xaxis label "Time (ps)"
@ yaxis label "(K)"
@TYPE xy
@ view 0.15, 0.15, 0.75, 0.85
@ legend on
@ legend box on
@ legend loctype view
@ legend 0.78, 0.8
@ legend length 2
@ s0 legend "Temperature"
0.000000  301.204895
1.000000  299.083496
2.000000  293.100250
3.000000  301.090637
4.000000  293.024811
5.000000  297.068481
6.000000  298.065125
7.000000  300.354370
8.000000  304.322693
9.000000  297.093170
10.000000  297.186615
11.000000  298.112732
12.000000  293.396545
13.000000  295.803162
14.000000  293.432037
15.000000  298.306702
16.000000  297.545715
17.000000  294.283875
18.000000  295.527771
19.000000  297.193665

3 Answers

User

#1 · Edited 2024-05-29 00:04:42

In Python, the equivalent program is:

import re

# Print every line that follows the "@ s0 legend" header line.
found_legend = False
with open('temp.xvg', 'r') as file1:
    for line in file1:
        if not found_legend:
            # The data block starts right after the legend line.
            if re.search(r"^@ s0 legend", line):
                found_legend = True
        else:
            # strip() removes the trailing newline and surrounding spaces.
            print(line.strip())

Save it as program.py, then run:

python3 program.py

You will get the following output:

0.000000  301.204895
1.000000  299.083496
2.000000  293.100250
3.000000  301.090637
4.000000  293.024811
5.000000  297.068481
6.000000  298.065125
7.000000  300.354370
8.000000  304.322693
9.000000  297.093170
10.000000  297.186615
11.000000  298.112732
12.000000  293.396545
13.000000  295.803162
14.000000  293.432037
15.000000  298.306702
16.000000  297.545715
17.000000  294.283875
18.000000  295.527771
19.000000  297.193665

User

#2 · Edited 2024-05-29 00:04:42

This is a one-liner in Python. For example:

file_data = [list(map(float,x.strip().split())) for x in open("filedata.txt","rt") if x.strip()[:1] not in "@#"]

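It reads the file, strips whitespace, drops non-data lines (blank lines and lines starting with @ or #), splits each remaining line, and converts the fields to float. The result is a list of data pairs.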
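
For readability, the same steps can be spelled out as a small function. This is a minimal sketch and not part of the original answer: the name parse_xvg_rows and the sample lines are hypothetical, and the function accepts any iterable of lines so it can be tried without a file on disk.

```python
def parse_xvg_rows(lines):
    """Return the numeric rows of an .xvg-style file as lists of floats.

    Hypothetical helper: it mirrors the one-liner step by step.
    """
    rows = []
    for line in lines:
        stripped = line.strip()
        # Skip blank lines, '#' comments and '@' plotting directives.
        if not stripped or stripped[0] in "@#":
            continue
        rows.append([float(field) for field in stripped.split()])
    return rows


# Illustrative sample standing in for the file contents.
sample = [
    "# GROMACS\n",
    '@ s0 legend "Temperature"\n',
    "0.000000  301.204895\n",
    "1.000000  299.083496\n",
]
print(parse_xvg_rows(sample))
```

With a real file, an open file object can be passed directly, for example inside `with open("filedata.txt") as f:`, which also makes sure the file is closed afterwards.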

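User

#3 · Edited 2024-05-29 00:04:42

You can read lines from the file until you reach the legend line, then use read_csv on the rest of the file. While reading the header lines, you can also extract the xaxis and yaxis labels to use as column names. For example: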

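import pandas as pd
import re

with open('test.dat', 'r') as f:
    # Scan the header, picking up the axis labels to use as column names.
    for line in f:
        m = re.search(r'xaxis\s+label\s+"([^"]+)"', line)
        if m is not None:
            xaxis = m.group(1)
        m = re.search(r'yaxis\s+label\s+"([^"]+)"', line)
        if m is not None:
            yaxis = m.group(1)
        # The data block starts right after the legend line.
        if line.startswith('@ s0 legend'):
            break
    # Parse the rest of the file as whitespace-separated columns.
    # sep=r'\s+' is used because newer pandas deprecates delim_whitespace=True.
    df = pd.read_csv(f, names=[xaxis, yaxis], sep=r'\s+')

print(df)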

Output

    Time (ps)         (K)
0         0.0  301.204895
1         1.0  299.083496
2         2.0  293.100250
3         3.0  301.090637
4         4.0  293.024811
5         5.0  297.068481
6         6.0  298.065125
7         7.0  300.354370
8         8.0  304.322693
9         9.0  297.093170
10       10.0  297.186615
11       11.0  298.112732
12       12.0  293.396545
13       13.0  295.803162
14       14.0  293.432037
15       15.0  298.306702
16       16.0  297.545715
17       17.0  294.283875
18       18.0  295.527771
19       19.0  297.193665
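
If pandas is not available, the same idea (take the axis labels from the header, then read the numeric rows) can be sketched with the standard library alone. The function name read_xvg and the sample lines are illustrative assumptions, and the sketch assumes exactly two numeric columns per data row.

```python
import re


def read_xvg(lines):
    """Split .xvg-style lines into axis labels and two columns of floats.

    Hypothetical helper; assumes two numeric columns per data row.
    """
    labels = {}
    times, values = [], []
    for line in lines:
        stripped = line.strip()
        # Header directives such as: @ xaxis label "Time (ps)"
        m = re.search(r'(xaxis|yaxis)\s+label\s+"([^"]+)"', stripped)
        if m is not None:
            labels[m.group(1)] = m.group(2)
            continue
        # Skip blank lines, comments and the remaining directives.
        if not stripped or stripped[0] in "@#":
            continue
        t, v = stripped.split()
        times.append(float(t))
        values.append(float(v))
    return labels, times, values


# Illustrative sample standing in for the file contents.
sample = [
    '@ xaxis label "Time (ps)"\n',
    '@ yaxis label "(K)"\n',
    '@ s0 legend "Temperature"\n',
    "0.000000  301.204895\n",
    "1.000000  299.083496\n",
]
labels, times, values = read_xvg(sample)
print(labels["xaxis"], times)
print(labels["yaxis"], values)
```

Unlike the pandas version, this does not stop at the legend line. It classifies every line on its own, so it also tolerates directives that appear after the data starts.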
