计算原子坐标之间的距离

ATOM 920 CA GLN A 203 39.292 -13.354 17.416 1.00 55.76 C ATOM 929 CA HIS A 204 38.546 -15.963 14.792 1.00 29.53 C ATOM 939 CA ASN A 205 39.443 -17.018 11.206 1.00 54.49 C ATOM 947 CA GLU A 206 41.454 -13.901 10.155 1.00 26.32 C ATOM 956 CA VAL A 207 43.664 -14.041 13.279 1.00 40.65 C . . . ATOM 963 CA GLU A 208 45.403 -17.443 13.188 1.00 40.25 C

3条回答

网友

1楼 · 编辑于 2024-05-16 03:32:30

不要在空白处拆分

这里给出的其他答案提出了一个错误的假设——坐标将以空格分隔。根据PDB specification of ^{}，这是而不是必要的情况：PDB记录值是由列索引指定的，并且可以相互流动。例如，第一个ATOM记录如下：

ATOM    920  CA  GLN A 203      39.292 -13.354  17.416  1.00 55.76           C

但这也是完全正确的：

ATOM    920  CA  GLN A 203      39.292-13.3540  17.416  1.00 55.76           C

更好的方法

由于列指定的索引以及PDB文件中可能出现的其他问题的数量，您不应该编写自己的解析器。PDB格式很混乱，需要处理很多特殊情况和格式错误的文件。相反，使用已经为您编写的解析器。

我喜欢Biopython的PDB.PDBParser。它将把结构解析为Python对象，并提供方便的特性。如果您更喜欢Perl，请查看BioPerl。

PDB.Residue对象允许按名称对原子进行键控访问，并且PDB.Atom对象重载-运算符以返回两个原子之间的距离。我们可以使用它来编写简洁明了的代码：

代码

from Bio import PDB
parser = PDB.PDBParser()

# Parse the structure into a PDB.Structure object
pdb_code = "1exm"
pdb_path = "pdb1exm.ent"
struct = parser.get_structure(pdb_code, pdb_path)

# Grab the first two residues from the structure
residues = struct.get_residues()
res_one = residues.next()
res_two = residues.next()

try:
    alpha_dist = res_one['CA'] - res_two['CA']
except KeyError:
    print "Alpha carbon missing, computing distance impossible!"

网友

2楼 · 编辑于 2024-05-16 03:32:30

假设您的数据在“atoms.txt”中，这将逐行读取并将条目拆分为一个列表：

import csv

with open("atoms.txt") as f:
  reader = csv.reader(f)
  for line, in reader:
      entries = line.split()
      print entries

现在为每个列表提取所需的列，并计算距离等（记住python中的列表是基于零的）。

网友

3楼 · 编辑于 2024-05-16 03:32:30

如果数据用空白分隔，那么一个简单的split就可以完成这项工作。对行进行缓冲以按顺序进行比较。

use strict;
use warnings;

my @line;
while (<>) {
    push @line, $_;            # add line to buffer
    next if @line < 2;         # skip unless buffer is full
    print proc(@line), "\n";   # process and print 
    shift @line;               # remove used line 
}

sub proc {
    my @a = split ' ', shift;   # line 1
    my @b = split ' ', shift;   # line 2
    my $x = ($a[6]-$b[6]);      # calculate the diffs
    my $y = ($a[7]-$b[7]);
    my $z = ($a[8]-$b[8]);
    my $dist = sprintf "%.1f",                # format the number
                   sqrt($x**2+$y**2+$z**2);   # do the calculation
    return "$a[3]-$b[3]\t$dist"; # return the string for printing
}

输出（带有示例数据）：

GLN-HIS 3.8
HIS-ASN 3.8
ASN-GLU 3.9
GLU-VAL 3.8

如果数据是制表符分隔的，则可以在/\t/而不是' '上拆分。

不要在空白处拆分

更好的方法

代码

相关问题更多 >

编程相关推荐

热门问题

热门文章