ensembl-rest: Python package on PyPI

An interface to the Ensembl REST API: biological data at your fingertips.

Detailed description of the ensembl-rest Python project

Ensembl-REST

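A Python interface to the Ensembl REST API. A whole world of biological data at your fingertips.

The Ensembl database contains reference biological data on almost any organism. It is now easy to access this data programmatically through the REST API.

The full list of endpoints of the Ensembl REST API, along with endpoint-specific documentation, can be found on their website (https://rest.ensembl.org).

This library also includes some utilities built on top of the API and designed to make it easier to work with, including an AssemblyMapper class that helps convert between different genome assemblies.

This project uses code from RESTEasy, which made my life much easier. Thanks!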

Installation

You can install it from PyPI:

$ pip install ensembl_rest

Examples

The library exports methods that point to each endpoint of the API, for example:

>>> import ensembl_rest
>>> ensembl_rest.symbol_lookup(species='homo sapiens', symbol='BRCA2')

{ 'species': 'human',
  'object_type': 'Gene',
  'description': 'BRCA2, DNA repair associated [Source:HGNC Symbol;Acc:HGNC:1101]',
  'assembly_name': 'GRCh38',
  'end': 32400266,
  ...
  ...
  ...
  'seq_region_name': '13',
  'strand': 1,
  'id': 'ENSG00000139618',
  'start': 32315474}

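All the endpoints are listed on the API website. For a quick overview of the methods, call help on the module:

>>> help(ensembl_rest)

If you want to use an endpoint listed on the API website, say GET lookup/symbol/:species/:symbol, the name of the corresponding method is the last part of the endpoint's documentation URL. In this case the documentation lives at http://rest.ensembl.org/documentation/info/symbol_lookup, so the corresponding method name is symbol_lookup.

>>> help(ensembl_rest.symbol_lookup)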

Help on function symbol_lookup in module ensembl_rest:

symbol_lookup(*args, **kwargs)
        Lookup ``GET lookup/symbol/:species/:symbol``

    Find the species and database for a symbol in a linked external database


    **Parameters**

    - Required:
            + **Name**:  species
            + *Type*:  String
            + *Description*:  Species name/alias
            + *Default*:  -
            + *Example Values*:  homo_sapiens, human
    ...
    ...

    - Optional:

            + **Name**:  expand
            + *Type*:  Boolean(0,1)
            + *Description*:  Expands the search to include any connected features. e.g. If the object is a gene, its transcripts, translations and exons will be returned as well.
    ...
    ...

    **Resource info**

    - **Methods**:  GET
    - **Response formats**: json, xml, jsonp


    **More info**

    https://rest.ensembl.org/documentation/info/symbol_lookup
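
The naming rule described above (the method name is the last path component of the endpoint's documentation URL) can be sketched in a few lines. The helper name method_name_from_doc_url is hypothetical and is not part of ensembl_rest; it only illustrates the convention.

```python
from urllib.parse import urlparse


def method_name_from_doc_url(doc_url):
    """Return the last path component of an endpoint documentation URL.

    Hypothetical helper: it mirrors the naming convention described in
    the README, it is not a function exported by ensembl_rest.
    """
    path = urlparse(doc_url).path
    return path.rstrip('/').rsplit('/', 1)[-1]


name = method_name_from_doc_url(
    'http://rest.ensembl.org/documentation/info/symbol_lookup'
)
print(name)
```

With the library installed, the resulting name could then be looked up on the module, for instance with getattr(ensembl_rest, name).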

We can see from the resource string GET lookup/symbol/:species/:symbol that this method takes two parameters, named species and symbol, so we can call the method as follows:

>>> ensembl_rest.symbol_lookup(species='homo sapiens', symbol='TP53')
>>> # Or like this...
>>> ensembl_rest.symbol_lookup('homo sapiens', 'TP53')

{'source': 'ensembl_havana',
  'object_type': 'Gene',
  'logic_name': 'ensembl_havana_gene',
 ...
 ...
 ...
  'start': 32315474}
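
To make the link between the resource string and the call arguments concrete, here is a minimal sketch of how ':name' placeholders could be filled from positional or keyword arguments. The function fill_resource is hypothetical; how the library builds its request URLs internally is not shown in this README, so treat this as an illustration of the idea only.

```python
import re
from urllib.parse import quote


def fill_resource(template, *args, **kwargs):
    """Fill the ':name' placeholders of an endpoint resource string.

    Hypothetical helper, not the library's implementation.
    """
    names = re.findall(r':(\w+)', template)
    # Positional arguments are matched to placeholders in order of appearance.
    values = dict(zip(names, args))
    values.update(kwargs)
    missing = [n for n in names if n not in values]
    if missing:
        raise TypeError(f"missing parameters: {missing}")
    # Percent-encode each value so it is safe inside a URL path.
    return re.sub(
        r':(\w+)',
        lambda m: quote(str(values[m.group(1)])),
        template,
    )


template = 'lookup/symbol/:species/:symbol'
print(fill_resource(template, 'homo sapiens', 'TP53'))
print(fill_resource(template, species='human', symbol='BRCA2'))
```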

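Optional parameters can be provided with the params keyword (the specific parameters to pass depend on the endpoint; see the official endpoint documentation on the API website):

>>> # Fetch also exons, transcripts, etc...
>>> ensembl_rest.symbol_lookup('human', 'BRCA2', params={'expand': True})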

{'source': 'ensembl_havana',
 'seq_region_name': '13',
 'Transcript': [{'source': 'ensembl_havana',
   'object_type': 'Transcript',
   'logic_name': 'ensembl_havana_transcript',
   'Exon': [{'object_type': 'Exon',
     'version': 4,
     'species': 'human',
     'assembly_name': 'GRCh38',
     ...
     ...
     ...
 'biotype': 'protein_coding',
 'start': 32315474}
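
An expanded response is a nested dictionary, so ordinary Python is enough to walk it. The sketch below counts transcripts and exons using the 'Transcript' and 'Exon' keys visible in the output above. The toy_gene dictionary is made-up sample data with the same shape, not a real Ensembl response, and count_features is a hypothetical helper.

```python
def count_features(gene):
    """Count the transcripts and exons in an expanded lookup result."""
    transcripts = gene.get('Transcript', [])
    n_exons = sum(len(t.get('Exon', [])) for t in transcripts)
    return len(transcripts), n_exons


# Made-up sample with the same nesting as the expanded response.
toy_gene = {
    'object_type': 'Gene',
    'Transcript': [
        {'object_type': 'Transcript',
         'Exon': [{'object_type': 'Exon'}, {'object_type': 'Exon'}]},
        {'object_type': 'Transcript',
         'Exon': [{'object_type': 'Exon'}]},
    ],
}

n_transcripts, n_exons = count_features(toy_gene)
print(n_transcripts, n_exons)
```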

The parameters of POST endpoints are also provided through the params keyword, as in the next example:

>>> ensembl_rest.symbol_post(species='human', params={'symbols': ["BRCA2", "TP53", "BRAF"]})

{
    "BRCA2": {
        "source": "ensembl_havana",
        "object_type": "Gene",
        "logic_name": "ensembl_havana_gene",
        "description": "BRCA2, DNA repair associated [Source:HGNC Symbol;Acc:HGNC:1101]",
        ...
        ...
    },
    "TP53": {
        ...
        ...
    },
    "BRAF": {
        ...
        ...
        "strand": -1,
        "id": "ENSG00000157764",
        "start": 140719327
    }
}
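
Since the POST response is keyed by symbol, it can be reduced to a compact lookup table with a dictionary comprehension. The sketch below assumes every record carries the 'id', 'start' and 'strand' keys seen in the output above; the sample dictionary reuses only values printed in this README, and summarize is a hypothetical helper.

```python
def summarize(response):
    """Map each symbol to an (id, start, strand) tuple."""
    return {
        symbol: (record['id'], record['start'], record['strand'])
        for symbol, record in response.items()
    }


# Sample built from values shown in the outputs of this README.
sample = {
    'BRCA2': {'id': 'ENSG00000139618', 'start': 32315474, 'strand': 1},
    'BRAF': {'id': 'ENSG00000157764', 'start': 140719327, 'strand': -1},
}

summary = summarize(sample)
for symbol, (gene_id, start, strand) in sorted(summary.items()):
    print(symbol, gene_id, start, strand)
```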

Another common use is to fetch the sequence of a known gene:

>>> ensembl_rest.sequence_id('ENSG00000157764')

{'desc': 'chromosome:GRCh38:7:140719327:140924928:-1',
 'query': 'ENSG00000157764',
 'version': 13,
 'id': 'ENSG00000157764',
 'seq': 'TTCCCCCAATCCCCTCAGGCTCGG...ATTGACTGCATGGAGAAGTCTTCA',
 'molecule': 'dna'}

If you want it in FASTA format, you can modify the headers:

>>> ensembl_rest.sequence_id('ENSG00000157764', headers={'content-type': 'text/x-fasta'})

>ENSG00000157764.13 chromosome:GRCh38:7:140719327:140924928:-1
TTCCCCCAATCCCCTCAGGCTCGGCTGCGCCCGGGGCCGCGGGCCGGTACCTGAGGTGGC
CCAGGCGCCCTCCGCCCGCGGCGCCGCCCGGGCCGCTCCTCCCCGCGCCCCCCGCGCCCC
CCGCTCCTCCGCCTCCGCCTCCGCCTCCGCCTCCCCCAGCTCTCCGCCTCCCTTCCCCCT
...
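
When the response comes back as FASTA text, a small parser turns it into (header, sequence) pairs. This is a minimal sketch for plain FASTA like the output above; parse_fasta is a hypothetical helper, and the example text is a shortened stand-in rather than the full sequence.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith('>'):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, ''.join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, ''.join(chunks)))
    return records


# Shortened stand-in for the FASTA text shown above.
example = (
    ">ENSG00000157764.13 chromosome:GRCh38:7:140719327:140924928:-1\n"
    "TTCCCCCAAT\n"
    "CCCCTCAGGC\n"
)

records = parse_fasta(example)
for header, seq in records:
    print(header.split()[0], len(seq))
```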

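Note that, if left unchanged, the methods request the data as JSON and return it as a dictionary, which makes it easy to work with. If the response cannot be decoded that way, it is returned as plain text, as in the FASTA example above.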
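
The decode-or-fall-back behaviour described above can be illustrated with the standard json module. This is a sketch of the general pattern under the assumption that the raw body is available as a string; it is not the library's actual code, and decode_body is a hypothetical name.

```python
import json


def decode_body(text):
    """Return parsed JSON if possible, otherwise the text unchanged."""
    try:
        return json.loads(text)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        return text


as_json = decode_body('{"id": "ENSG00000157764", "molecule": "dna"}')
as_text = decode_body('>ENSG00000157764.13\nTTCCCCCAAT\n')
print(type(as_json).__name__, type(as_text).__name__)
```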

You can also map between assemblies...

>>> ensembl_rest.assembly_map(species='human', asm_one='GRCh37', region='X:1000000..1000100:1', asm_two='GRCh38')
>>> # Or...
>>> region_str = ensembl_rest.region_str(chrom='X', start=1000000, end=1000100)
>>> ensembl_rest.assembly_map(species='human', asm_one='GRCh37', region=region_str, asm_two='GRCh38')

{'mappings': [{'original': {'seq_region_name': 'X',
    'strand': 1,
    'coord_system': 'chromosome',
    'end': 1000100,
    'start': 1000000,
    'assembly': 'GRCh37'},
   'mapped': {'seq_region_name': 'X',
    'strand': 1,
    'coord_system': 'chromosome',
    'end': 1039365,
    'start': 1039265,
    'assembly': 'GRCh38'}}]}
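
The region string and the mapping response can both be handled with plain Python. The sketch below formats a region in the 'chrom:start..end:strand' form used in the call above and extracts the mapped coordinates from a response shaped like the output above. Both helpers are hypothetical; in particular, format_region is not the library's region_str, whose exact output is not shown in this README.

```python
def format_region(chrom, start, end, strand=1):
    """Build a region string such as 'X:1000000..1000100:1'."""
    return f"{chrom}:{start}..{end}:{strand}"


def mapped_intervals(response):
    """Return (chrom, start, end) for every mapping in the response."""
    return [
        (m['mapped']['seq_region_name'],
         m['mapped']['start'],
         m['mapped']['end'])
        for m in response.get('mappings', [])
    ]


# Response copied from the output shown above (trimmed to the used keys).
response = {
    'mappings': [{
        'original': {'seq_region_name': 'X', 'start': 1000000,
                     'end': 1000100, 'assembly': 'GRCh37'},
        'mapped': {'seq_region_name': 'X', 'start': 1039265,
                   'end': 1039365, 'assembly': 'GRCh38'},
    }]
}

print(format_region('X', 1000000, 1000100))
print(mapped_intervals(response))
```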

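The problem above (mapping from one assembly to another) is so frequent that the library provides a specialized class, AssemblyMapper, to map large numbers of regions between assemblies efficiently. Instead of issuing a time-consuming web request every time a mapping is needed, this class fetches the mapping of the whole assembly when it is instantiated. That is itself a time-consuming operation, but it pays off when you have to convert between assemblies repeatedly: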

>>> mapper = ensembl_rest.AssemblyMapper(
                species='human',
                from_assembly='GRCh37',
                to_assembly='GRCh38'
            )

>>> mapper.map(chrom='1', pos=1000000)
1064620
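
To show why fetching the whole mapping up front pays off, here is a sketch of an offline lookup over preloaded segments using binary search. It is an illustration of the general idea only: the class IntervalMapper is hypothetical, it handles a single chromosome and same-strand segments, and it makes no claim about how AssemblyMapper is implemented. The single segment reuses the GRCh37 to GRCh38 coordinates from the assembly_map output above.

```python
import bisect


class IntervalMapper:
    """Map positions through preloaded (src_start, src_end, dst_start) segments.

    Hypothetical sketch: segments are inclusive, non-overlapping and on
    the same strand in both assemblies.
    """

    def __init__(self, segments):
        self._segments = sorted(segments)
        self._starts = [seg[0] for seg in self._segments]

    def map(self, pos):
        # Find the last segment that starts at or before pos.
        i = bisect.bisect_right(self._starts, pos) - 1
        if i < 0:
            return None
        src_start, src_end, dst_start = self._segments[i]
        if pos > src_end:
            return None  # pos falls in a gap between segments
        return dst_start + (pos - src_start)


mapper = IntervalMapper([(1000000, 1000100, 1039265)])
print(mapper.map(1000050))
```

Once the segments are loaded, each lookup costs a binary search rather than a web request, which is the trade-off the paragraph above describes.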

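You can also find orthologs, paralogs and gene tree information, along with variation data and basically everything Ensembl has to offer.

If you want to instantiate your own client, you can use the ensembl_rest.EnsemblClient class, which contains all the endpoint methods.

>>> client = ensembl_rest.EnsemblClient()
>>> client.symbol_lookup('homo sapiens', 'TP53')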

{'source': 'ensembl_havana',
  'object_type': 'Gene',
  'logic_name': 'ensembl_havana_gene',
  'version': 14,
  'species': 'human',
  ...
  ...
  ...}

Finally, the library exposes the class ensembl_rest.HTTPError, which allows handling errors in the requests. One use case is querying the GET genetree/member/symbol/:species/:symbol endpoint for gene trees in order to find the orthologs and paralogs of a protein or gene. When no gene tree is found, this endpoint returns an HTTP error with code 400 and the error message Lookup found nothing. We can use this information to detect the error and handle it, or to ignore it if we expect it:

for gene in ['TP53', 'rare-new-gene', 'BRCA2']:
    try:
        gene_tree = ensembl_rest.genetree_member_symbol(
            species='human',
            symbol=gene,
            params={'prune_species': 'human'}
        )
        # Assuming we have a function to extract the paralogs
        paralogs = extract_paralogs(gene_tree['tree'])
        print(paralogs)

    # Handle the case when there's no gene tree
    except ensembl_rest.HTTPError as err:
        error_code = err.response.status_code
        error_message = err.response.json()['error']
        if (error_code == 400) \
           and ('Lookup found nothing' in error_message):
            # Skip the gene with no data
            pass
        else:
            # The exception was caused by another problem
            # Raise the exception again
            raise
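
The check inside the except branch can be factored into a small predicate so that it is easy to reuse and to test without network access. The function is_lookup_miss is a hypothetical helper; it takes the status code and the decoded error payload, which in the loop above would come from err.response.status_code and err.response.json().

```python
def is_lookup_miss(status_code, payload):
    """Return True for the 'no data for this symbol' error described above."""
    message = payload.get('error', '')
    return status_code == 400 and 'Lookup found nothing' in message


# Offline examples with made-up payloads of the documented shape.
print(is_lookup_miss(400, {'error': 'Lookup found nothing.'}))
print(is_lookup_miss(500, {'error': 'Internal error'}))
```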

Meta

Author: Ad115 - Github - a.garcia230395@gmail.com

Project pages: Docs - @GitHub - @PyPI

Distributed under the MIT license. See LICENSE for more information.

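Contributing

Check for open issues, or open a new issue to start a discussion around a feature idea or a bug.
Fork the repository on GitHub to start making your changes on a feature branch derived from the master branch.
Write a test which shows that the bug was fixed or that the feature works as expected.
Send a pull request and bug the maintainer until it gets merged and published.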

ensembl-rest 0.3.3
