How can I prepend the line matching a specific pattern to each of the lines that follow it?

Posted 2024-03-29 09:07:34


I have a text file like this:

geo.txt

Receptor Name:I151T.B99990002_mus.pdbqt
Liang Name: LIGAND 1
Using random seed: 1896818552

mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
     1        -10.7      0.000      0.000
     2        -10.4      1.859      3.037
     3        -10.1      1.992      3.474

Receptor Name: I151T.B99990001_mus.pdbqt
Liang Name: LIGAND 1
Using random seed: 1896818552

mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
     1         -9.5      0.000      0.000
     2         -9.4      2.083      3.784
     3         -9.0      2.471      8.360
     4         -9.0      1.493      3.523

In the file above, I want to extract every line that matches the pattern (Receptor Name:) and join each of those lines with its corresponding values.

I tried:

import re

with open("/Users/geoList.txt", "r") as h:
    for i in h:
        if re.match(r'\s\s\s\d+', i) or i.startswith("Receptor Name:"):
            print(i.replace("\n", ""))

I get the following output:

Receptor Name: I151T.B99990002_mus.pdbqt
   1        -10.7      0.000      0.000
   2        -10.4      1.859      3.037
   3        -10.1      1.992      3.474

Receptor Name:I151T.B99990001_mus.pdbqt
   1         -9.5      0.000      0.000
   2         -9.4      2.083      3.784
   3         -9.0      2.471      8.360
   4         -9.0      1.493      3.523

But at this point I don't know how to join each (Receptor Name:) line with its respective values.

For example, the expected output file should look like this:

FIRST PATTERN MATCH: with corresponding values:
-----------------------------------------------

Receptor Name:I151T.B99990002_mus.pdbqt 1        -10.7      0.000      0.000

Receptor Name:I151T.B99990002_mus.pdbqt 2        -10.4      1.859      3.037

Receptor Name:I151T.B99990002_mus.pdbqt 3        -10.1      1.992      3.474

SECOND PATTERN MATCH: with corresponding values
-----------------------------------------------

Receptor Name: I151T.B99990001_mus.pdbqt 1         -9.5      0.000      0.000

Receptor Name: I151T.B99990001_mus.pdbqt 2         -9.4      2.083      3.784

Receptor Name: I151T.B99990001_mus.pdbqt 3         -9.0      2.471      8.360

Receptor Name: I151T.B99990001_mus.pdbqt 4         -9.0      1.493      3.523

Thanks in advance.


2 answers

You can do it like this (you only need to store the line containing "Receptor"):

>>> for line in h:
...     if line.startswith('Receptor Name:'):
...         prefix = line.strip()  # strip the newline so the values stay on the same line
...     elif re.search(r'^\s+\d', line):
...         print(prefix + ' ' + line.strip())

You can also do this entirely with a regular expression:

/(^Receptor Name:[^\n]*)(?:.*?^[-+]+)(.*?)(?=^Receptor Name:|\Z)/\1\2/gms
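
To make the substitution notation concrete, here is a rough Python equivalent using `re.sub`, where the `m` and `s` flags correspond to `re.M` and `re.S`. The `sample` string is a shortened copy of the question's data and the variable names are illustrative assumptions. On its own, the substitution only drops the lines between each header and its result rows; it does not yet put the header in front of every row.

```python
import re

pattern = re.compile(
    r"(^Receptor Name:[^\n]*)(?:.*?^[-+]+)(.*?)(?=^Receptor Name:|\Z)",
    flags=re.M | re.S,
)

# Shortened copy of the question's data (hypothetical inline sample).
sample = """\
Receptor Name:I151T.B99990002_mus.pdbqt
Liang Name: LIGAND 1
Using random seed: 1896818552

mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
     1        -10.7      0.000      0.000
     2        -10.4      1.859      3.037

Receptor Name: I151T.B99990001_mus.pdbqt
Liang Name: LIGAND 1
Using random seed: 1896818552

mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
     1         -9.5      0.000      0.000
"""

# \1 is the header line; \2 is everything after the separator row.
# The text between them (ligand, seed, table header) is discarded.
reduced = pattern.sub(r"\1\2", sample)
print(reduced)
```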


You can then easily turn that into Python logic that does what you want:

txt='''\
Receptor Name:I151T.B99990002_mus.pdbqt
Liang Name: LIGAND 1
Using random seed: 1896818552

mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
     1        -10.7      0.000      0.000
     2        -10.4      1.859      3.037
     3        -10.1      1.992      3.474

Receptor Name: I151T.B99990001_mus.pdbqt
Liang Name: LIGAND 1
Using random seed: 1896818552

mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
     1         -9.5      0.000      0.000
     2         -9.4      2.083      3.784
     3         -9.0      2.471      8.360
     4         -9.0      1.493      3.523'''

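import re

# Group 1: the "Receptor Name:" line; group 2: everything after the separator row,
# up to the next "Receptor Name:" line or the end of the text.
pat = re.compile(r'(^Receptor Name:[^\n]*)(?:.*?^[-+]+)(.*?)(?=^Receptor Name:|\Z)', flags=re.S | re.M)

for m in pat.finditer(txt):
    for line in m.group(2).splitlines():
        line = line.strip()
        if line:
            print(m.group(1), line)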

This prints:

Receptor Name:I151T.B99990002_mus.pdbqt 1        -10.7      0.000      0.000
Receptor Name:I151T.B99990002_mus.pdbqt 2        -10.4      1.859      3.037
Receptor Name:I151T.B99990002_mus.pdbqt 3        -10.1      1.992      3.474
Receptor Name: I151T.B99990001_mus.pdbqt 1         -9.5      0.000      0.000
Receptor Name: I151T.B99990001_mus.pdbqt 2         -9.4      2.083      3.784
Receptor Name: I151T.B99990001_mus.pdbqt 3         -9.0      2.471      8.360
Receptor Name: I151T.B99990001_mus.pdbqt 4         -9.0      1.493      3.523
