import biotite.database.rcsb as rcsb
import biotite.structure as struc
import biotite.structure.io.pdbx as pdbx
ID = "1BRS"
# Download structure
file_name = rcsb.fetch(ID, "pdbx", target_path=".")
# Read file
file = pdbx.PDBxFile()
file.read(file_name)
# Get 'pdbx_struct_assembly_gen' category as dictionary
assembly_dict = file["pdbx_struct_assembly_gen"]
for asym_id_list in assembly_dict["asym_id_list"]:
    chain_ids = asym_id_list.split(",")
    print(f"{ID}_{':'.join(chain_ids)}")
The output is:
1BRS_A:D:G:J
1BRS_B:E:H:K
1BRS_C:F:I:L
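Chains G-L contain only water molecules.
Edit:
To include only the chain IDs that belong to a polymer (e.g. a protein or a nucleic acid), you can use the entity_poly category: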
loop_
_entity_poly.entity_id
_entity_poly.type
_entity_poly.nstd_linkage
_entity_poly.nstd_monomer
_entity_poly.pdbx_seq_one_letter_code
_entity_poly.pdbx_seq_one_letter_code_can
_entity_poly.pdbx_strand_id
_entity_poly.pdbx_target_identifier
1 'polypeptide(L)' no no
;AQVINTFDGVADYLQTYHKLPDNYITKSEAQALGWVASKGNLADVAPGKSIGGDIFSNREGKLPGKSGRTWREADINYTS
GFRNSDRILYSSDWLIYKTTDHYQTFTKIR
;
;AQVINTFDGVADYLQTYHKLPDNYITKSEAQALGWVASKGNLADVAPGKSIGGDIFSNREGKLPGKSGRTWREADINYTS
GFRNSDRILYSSDWLIYKTTDHYQTFTKIR
;
A,B,C ?
2 'polypeptide(L)' no no
;KKAVINGEQIRSISDLHQTLKKELALPEYYGENLDALWDALTGWVEYPLVLEWRQFEQSKQLTENGAESVLQVFREAKAE
GADITIILS
;
;KKAVINGEQIRSISDLHQTLKKELALPEYYGENLDALWDALTGWVEYPLVLEWRQFEQSKQLTENGAESVLQVFREAKAE
GADITIILS
;
D,E,F ?
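To make the role of the pdbx_strand_id column concrete, here is a minimal sketch that maps each polymer entity to its chains. It uses a plain dictionary as a stand-in for the parsed category (an assumption for illustration, with values copied from the excerpt above), and `chains_by_entity` is a hypothetical helper name, not part of Biotite.

```python
# Plain-dict stand-in for the parsed 'entity_poly' category.
# Assumption: each column is a list of strings; the values are
# copied from the CIF excerpt above.
entity_poly = {
    "entity_id": ["1", "2"],
    "type": ["polypeptide(L)", "polypeptide(L)"],
    "pdbx_strand_id": ["A,B,C", "D,E,F"],
}


def chains_by_entity(category):
    """Map each polymer entity ID to the list of its chain IDs."""
    return {
        entity_id: strand_ids.split(",")
        for entity_id, strand_ids in zip(
            category["entity_id"], category["pdbx_strand_id"]
        )
    }


entity_chains = chains_by_entity(entity_poly)
print(entity_chains)
# {'1': ['A', 'B', 'C'], '2': ['D', 'E', 'F']}
```

Every chain that appears in this mapping is a polymer chain; chains that appear nowhere in it (G-L here) are not.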
Here is the updated Python code:
import biotite.database.rcsb as rcsb
import biotite.structure as struc
import biotite.structure.io.pdbx as pdbx
ID = "1BRS"
# Download structure
file_name = rcsb.fetch(ID, "pdbx", target_path=".")
# Read file
file = pdbx.PDBxFile()
file.read(file_name)
# Get 'entity_poly' category as dictionary
# to find out which chains are polymers
poly_chains = []
for chain_list in file["entity_poly"]["pdbx_strand_id"]:
    poly_chains += chain_list.split(",")
# Get 'pdbx_struct_assembly_gen' category as dictionary
for asym_id_list in file["pdbx_struct_assembly_gen"]["asym_id_list"]:
    chain_ids = asym_id_list.split(",")
    # Filter chains that belong to a polymer
    chain_ids = [chain_id for chain_id in chain_ids if chain_id in poly_chains]
    print(f"{ID}_{':'.join(chain_ids)}")
This is the output:
1BRS_A:D
1BRS_B:E
1BRS_C:F
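The filtering step can also be tried without downloading anything. The sketch below is an illustration, not Biotite code: `assembly_labels` is a hypothetical helper, and its inputs are plain lists of strings, with the assembly lists reconstructed from the first output above and the strand lists taken from the entity_poly excerpt.

```python
def assembly_labels(pdb_id, asym_id_lists, strand_id_lists):
    """Build one 'ID_chain:chain' label per assembly, polymer chains only."""
    # Collect polymer chain IDs in a set for fast membership tests
    poly_chains = set()
    for strand_ids in strand_id_lists:
        poly_chains.update(strand_ids.split(","))

    labels = []
    for asym_id_list in asym_id_lists:
        # Keep only the chains that belong to a polymer, in original order
        chain_ids = [c for c in asym_id_list.split(",") if c in poly_chains]
        # Skip assemblies that contain no polymer chain at all
        if chain_ids:
            labels.append(f"{pdb_id}_{':'.join(chain_ids)}")
    return labels


labels = assembly_labels(
    "1BRS",
    ["A,D,G,J", "B,E,H,K", "C,F,I,L"],  # asym_id_list values
    ["A,B,C", "D,E,F"],                 # pdbx_strand_id values
)
print(labels)
# ['1BRS_A:D', '1BRS_B:E', '1BRS_C:F']
```

One caveat, as far as the mmCIF dictionary describes it: pdbx_strand_id holds author chain IDs, whereas asym_id_list holds label asym IDs. The two agree for the polymer chains of 1BRS, but that may not be true for every entry.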
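For background: the PDBx/mmCIF file format stores this information in the _pdbx_struct_assembly_gen category. These files can be read, for example, with Biotite (https://www.biotite-python.org/), a package I am developing. The categories can be read in a dictionary-like manner, as in the code above.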