Python client for the MyGene.info services.
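Introduction
MyGene.Info provides simple-to-use REST web services to query and retrieve gene annotation data. It is designed with simplicity and performance in mind. mygene is an easy-to-use Python wrapper for accessing the MyGene.Info services.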
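To make the "REST web services" part concrete, here is a minimal sketch of the kind of URLs a wrapper like mygene would request on your behalf. The base URL `https://mygene.info/v3` and the helper names `gene_url` and `query_url` are assumptions for illustration and are not taken from this document. The code only builds strings and sends no request.

```python
from urllib.parse import urlencode

# Assumed endpoint root; this document does not state it.
BASE_URL = "https://mygene.info/v3"


def gene_url(gene_id, fields=None):
    """Build the URL for retrieving one gene annotation object by ID."""
    url = "{}/gene/{}".format(BASE_URL, gene_id)
    if fields:
        # Keep commas readable so the field list looks like the Python call.
        url += "?" + urlencode({"fields": fields}, safe=",.")
    return url


def query_url(q, **params):
    """Build the URL for a gene query, e.g. q='symbol:cdk2'."""
    merged = dict(q=q, **params)
    return "{}/query?{}".format(BASE_URL, urlencode(merged, safe=",.:"))


single = gene_url(1017, "name,symbol,refseq.rna")
search = query_url("symbol:cdk2", species="human", size=5)
print(single)
print(search)
```

The Python client hides this URL handling behind method calls such as getgene and query.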
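Since v3.1.0, the mygene Python package is a thin wrapper around the underlying biothings_client package, a universal Python client for all BioThings APIs, including MyGene.info. Installing mygene automatically installs biothings_client. The following two code snippets are essentially equivalent:

Continue using the mygene package:

    In [1]: import mygene
    In [2]: mg = mygene.MyGeneInfo()

Use the biothings_client package directly:

    In [1]: from biothings_client import get_client
    In [2]: mg = get_client('gene')

After that, the mg instance is used in exactly the same way, as shown in the usage examples below.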
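Because both entry points yield an interchangeable client, a script can pick whichever package happens to be importable. The following is a minimal sketch of that idea. The helper names `available_backends` and `make_gene_client` are hypothetical, and the only library calls used are the two shown above (`mygene.MyGeneInfo()` and `get_client('gene')`).

```python
import importlib
import importlib.util


def available_backends():
    """List which of the two client packages can be imported here."""
    candidates = ("mygene", "biothings_client")
    return [name for name in candidates
            if importlib.util.find_spec(name) is not None]


def make_gene_client():
    """Return a gene client, preferring mygene, else biothings_client."""
    backends = available_backends()
    if "mygene" in backends:
        return importlib.import_module("mygene").MyGeneInfo()
    if "biothings_client" in backends:
        return importlib.import_module("biothings_client").get_client("gene")
    raise ImportError("neither mygene nor biothings_client is installed")


backends = available_backends()
print("importable client packages:", backends)
```

Since installing mygene pulls in biothings_client, the fallback branch only matters in environments where biothings_client was installed on its own.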
Requirements
python >=2.7 (including python3)
(Python 2.6 might still work, but it is no longer supported as of v3.1.0.)
biothings_client (>=0.2.0, install using “pip install biothings_client”)
Optional dependencies
pandas (install using “pip install pandas”) is required for returning a list of gene objects as DataFrame.
Installation
- Option 1

      pip install mygene

- Option 2

  download/extract the source code and run:

      python setup.py install

- Option 3

  install the latest code directly from the repository:

      pip install -e git+https://github.com/biothings/mygene.py#egg=mygene
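After any of the install options, it can be handy to confirm which versions ended up in the environment, given the v3.1.0 and biothings_client >=0.2.0 notes above. This is a minimal sketch using the standard library's `importlib.metadata`, which assumes Python 3.8 or newer. The helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# pandas is optional, so "not installed" is an acceptable result for it.
report = {name: installed_version(name)
          for name in ("mygene", "biothings_client", "pandas")}
for name, version in report.items():
    print(name, version or "not installed")
```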
Version history
CHANGES.txt
Usage
    In [1]: import mygene

    In [2]: mg = mygene.MyGeneInfo()

    In [3]: mg.getgene(1017)
    Out[3]:
    {'_id': '1017',
     'entrezgene': 1017,
     'name': 'cyclin-dependent kinase 2',
     'symbol': 'CDK2',
     'taxid': 9606}

    In [4]: mg.getgene(1017, 'name,symbol,refseq')
    Out[4]:
    {'_id': '1017',
     'name': 'cyclin-dependent kinase 2',
     'refseq': {'genomic': ['AC_000144.1', 'NC_000012.11', 'NG_028086.1', 'NT_029419.12', 'NW_001838059.1'],
                'protein': ['NP_001789.2', 'NP_439892.2'],
                'rna': ['NM_001798.3', 'NM_052827.2']},
     'symbol': 'CDK2'}

    In [5]: mg.getgene(1017, 'name,symbol,refseq.rna')
    Out[5]:
    {'_id': '1017',
     'name': 'cyclin-dependent kinase 2',
     'refseq': {'rna': ['NM_001798', 'NM_052827']},
     'symbol': 'CDK2'}

    In [6]: mg.getgenes([1017, 1018, 'ENSG00000148795'])
    Out[6]:
    [{'_id': '1017',
      'entrezgene': 1017,
      'name': 'cyclin-dependent kinase 2',
      'query': '1017',
      'symbol': 'CDK2',
      'taxid': 9606},
     {'_id': '1018',
      'entrezgene': 1018,
      'name': 'cyclin-dependent kinase 3',
      'query': '1018',
      'symbol': 'CDK3',
      'taxid': 9606},
     {'_id': '1586',
      'entrezgene': 1586,
      'name': 'cytochrome P450, family 17, subfamily A, polypeptide 1',
      'query': 'ENSG00000148795',
      'symbol': 'CYP17A1',
      'taxid': 9606}]

    In [7]: mg.getgenes([1017, 1018, 'ENSG00000148795'], as_dataframe=True)
    Out[7]:
                      _id  entrezgene  \
    query
    1017             1017        1017
    1018             1018        1018
    ENSG00000148795  1586        1586

                                                                  name   symbol  \
    query
    1017                                     cyclin-dependent kinase 2     CDK2
    1018                                     cyclin-dependent kinase 3     CDK3
    ENSG00000148795  cytochrome P450, family 17, subfamily A, polyp...  CYP17A1

                     taxid
    query
    1017              9606
    1018              9606
    ENSG00000148795   9606

    [3 rows x 5 columns]

    In [8]: mg.query('cdk2', size=5)
    Out[8]:
    {'hits': [{'_id': '1017',
       '_score': 373.24667,
       'entrezgene': 1017,
       'name': 'cyclin-dependent kinase 2',
       'symbol': 'CDK2',
       'taxid': 9606},
      {'_id': '12566',
       '_score': 353.90176,
       'entrezgene': 12566,
       'name': 'cyclin-dependent kinase 2',
       'symbol': 'Cdk2',
       'taxid': 10090},
      {'_id': '362817',
       '_score': 264.88477,
       'entrezgene': 362817,
       'name': 'cyclin dependent kinase 2',
       'symbol': 'Cdk2',
       'taxid': 10116},
      {'_id': '52004',
       '_score': 21.221401,
       'entrezgene': 52004,
       'name': 'CDK2-associated protein 2',
       'symbol': 'Cdk2ap2',
       'taxid': 10090},
      {'_id': '143384',
       '_score': 18.617256,
       'entrezgene': 143384,
       'name': 'CDK2-associated, cullin domain 1',
       'symbol': 'CACUL1',
       'taxid': 9606}],
     'max_score': 373.24667,
     'took': 10,
     'total': 28}

    In [9]: mg.query('reporter:1000_at')
    Out[9]:
    {'hits': [{'_id': '5595',
       '_score': 11.163337,
       'entrezgene': 5595,
       'name': 'mitogen-activated protein kinase 3',
       'symbol': 'MAPK3',
       'taxid': 9606}],
     'max_score': 11.163337,
     'took': 6,
     'total': 1}

    In [10]: mg.query('symbol:cdk2', species='human')
    Out[10]:
    {'hits': [{'_id': '1017',
       '_score': 84.17707,
       'entrezgene': 1017,
       'name': 'cyclin-dependent kinase 2',
       'symbol': 'CDK2',
       'taxid': 9606}],
     'max_score': 84.17707,
     'took': 27,
     'total': 1}

    In [11]: mg.querymany([1017, '695'], scopes='entrezgene', species='human')
    Finished.
    Out[11]:
    [{'_id': '1017',
      'entrezgene': 1017,
      'name': 'cyclin-dependent kinase 2',
      'query': '1017',
      'symbol': 'CDK2',
      'taxid': 9606},
     {'_id': '695',
      'entrezgene': 695,
      'name': 'Bruton agammaglobulinemia tyrosine kinase',
      'query': '695',
      'symbol': 'BTK',
      'taxid': 9606}]

    In [12]: mg.querymany([1017, '695'], scopes='entrezgene', species=9606)
    Finished.
    Out[12]:
    [{'_id': '1017',
      'entrezgene': 1017,
      'name': 'cyclin-dependent kinase 2',
      'query': '1017',
      'symbol': 'CDK2',
      'taxid': 9606},
     {'_id': '695',
      'entrezgene': 695,
      'name': 'Bruton agammaglobulinemia tyrosine kinase',
      'query': '695',
      'symbol': 'BTK',
      'taxid': 9606}]

    In [13]: mg.querymany([1017, '695'], scopes='entrezgene', species=9606, as_dataframe=True)
    Finished.
    Out[13]:
            _id  entrezgene                                       name symbol  \
    query
    1017   1017        1017                  cyclin-dependent kinase 2   CDK2
    695     695         695  Bruton agammaglobulinemia tyrosine kinase    BTK

           taxid
    query
    1017    9606
    695     9606

    [2 rows x 5 columns]

    In [14]: mg.querymany([1017, '695', 'NA_TEST'], scopes='entrezgene', species='human')
    Finished.
    Out[14]:
    [{'_id': '1017',
      'entrezgene': 1017,
      'name': 'cyclin-dependent kinase 2',
      'query': '1017',
      'symbol': 'CDK2',
      'taxid': 9606},
     {'_id': '695',
      'entrezgene': 695,
      'name': 'Bruton agammaglobulinemia tyrosine kinase',
      'query': '695',
      'symbol': 'BTK',
      'taxid': 9606},
     {'notfound': True, 'query': 'NA_TEST'}]

    # query all human kinases using the fetch_all parameter:
    In [15]: kinases = mg.query('name:kinase', species='human', fetch_all=True)

    In [16]: kinases
    Out[16]: <generator object _fetch_all at 0x7fec027d2eb0>

    # kinases is a Python generator, now you can loop through it to get all 1073 hits:
    In [17]: for gene in kinases:
       ....:     print(gene['_id'], gene['symbol'])
    Out[17]: <output omitted here>
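Two patterns from the session above tend to need a little post-processing: querymany results can contain `{'notfound': True, ...}` entries, and fetch_all returns a generator rather than a list. The sketch below shows one way to handle both. The helper names `split_hits` and `preview` are hypothetical, the sample data is copied from the Out[14] listing, and the generator is a stand-in so the code runs without network access.

```python
from itertools import islice


def split_hits(results):
    """Separate querymany-style results into matched genes and unmatched terms."""
    found = [r for r in results if not r.get("notfound")]
    missing = [r["query"] for r in results if r.get("notfound")]
    return found, missing


def preview(hits, n=3):
    """Return at most the first n items of an iterable such as a generator."""
    return list(islice(hits, n))


# Sample data copied from the Out[14] listing in the usage session.
sample = [
    {"_id": "1017", "entrezgene": 1017, "name": "cyclin-dependent kinase 2",
     "query": "1017", "symbol": "CDK2", "taxid": 9606},
    {"_id": "695", "entrezgene": 695,
     "name": "Bruton agammaglobulinemia tyrosine kinase",
     "query": "695", "symbol": "BTK", "taxid": 9606},
    {"notfound": True, "query": "NA_TEST"},
]

found, missing = split_hits(sample)
symbols = [gene["symbol"] for gene in found]
print(symbols, missing)

# Stand-in for a fetch_all generator; a live client would supply real hits.
fake_hits = ({"_id": str(i)} for i in range(1000))
first_three = preview(fake_hits)
print(first_three)
```

Using `islice` consumes only the items it returns, so the rest of the generator is still available for a later loop.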
Contact
- Send us any questions or feedback:
  - biothings@googlegroups.com (public discussion)
  - help@mygene.info (reach the developers privately)
  - GitHub issues
  - on Twitter @mygeneinfo
- Post a question on BioStars.org with the tag mygene.