Downloading data from the ENA database FTP server using accession numbers

2 answers

User

Answer 1 · edited 2024-05-16 00:12:30

I found a solution to my own problem:

curl -F accessions=@<input_file_path> http://www.ebi.ac.uk/ena/data/download?display=<output_format> -o <output_file_name>
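If the accessions live in a script rather than in a file, the curl call above can be assembled programmatically. The following is only a sketch: the endpoint is copied verbatim from this answer and has not been checked here, the function names are hypothetical, and writing one accession per line is an assumption about the file layout the server expects.

```python
import shlex
import tempfile
from pathlib import Path

# Endpoint copied verbatim from the answer; not verified to still respond.
ENA_DOWNLOAD_URL = "http://www.ebi.ac.uk/ena/data/download"


def write_accession_file(accessions, path):
    """Write accessions to `path`, skipping blanks and duplicates.

    One accession per line is an ASSUMPTION about the expected layout.
    Returns the list of accessions actually written, in order.
    """
    kept = []
    for acc in accessions:
        acc = acc.strip()
        if acc and acc not in kept:
            kept.append(acc)
    Path(path).write_text("\n".join(kept) + "\n")
    return kept


def build_curl_command(input_path, output_format, output_path):
    """Return the curl argument list matching the command in the answer."""
    url = ENA_DOWNLOAD_URL + "?display=" + output_format
    return [
        "curl",
        "-F", "accessions=@" + str(input_path),
        url,
        "-o", str(output_path),
    ]


if __name__ == "__main__":
    # Placeholder identifiers for illustration only, not real accessions.
    placeholders = ["ACCESSION_1", "ACCESSION_2", "ACCESSION_1"]
    with tempfile.TemporaryDirectory() as tmp:
        acc_file = Path(tmp) / "accessions.txt"
        write_accession_file(placeholders, acc_file)
        cmd = build_curl_command(acc_file, "fasta", "sequences.fasta")
        # Only print the command; nothing is sent over the network here.
        print(shlex.join(cmd))
```

To actually run the download, the returned list can be passed to `subprocess.run(cmd, check=True)`. Building the command as a list rather than a string avoids shell quoting problems with the `?` in the URL.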

User

Answer 2 · edited 2024-05-16 00:12:30

I think this old script of mine might be useful.

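One thing to note: the genomes are downloaded from NCBI rather than ENA, but I believe many of these databases are synchronized with each other, so you should still be able to find what you want.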

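If you only want to download the genomes for a given set of accession numbers (~2500), this may not work as is, unless you filter the returned search_results before downloading them with Entrez.efetch.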

#!/usr/bin/env python3

from Bio import Entrez

search_term = input("Organism name: ")

Entrez.email = "your_email@isp.com"   # required by NCBI
# usehistory="y" keeps the result set on NCBI's history server so it can be fetched in batches
search_handle = Entrez.esearch(db="nucleotide", term=search_term, usehistory="y", property='complete genome')
search_results = Entrez.read(search_handle)
search_handle.close()

gi_list = search_results["IdList"]
count = int(search_results["Count"])
# WebEnv and QueryKey identify the stored result set for the efetch calls below
webenv = search_results["WebEnv"]
query_key = search_results["QueryKey"]

batch_size = 5    # download sequences in batches so NCBI doesn't time you out

with open("ALL_SEQ.fasta", "w") as out_handle:
    for start in range(0, count, batch_size):
        end = min(count, start + batch_size)
        print("Going to download record %i to %i" % (start + 1, end))
        fetch_handle = Entrez.efetch(db="nucleotide", rettype="fasta", retmode="text",
                                     retstart=start, retmax=batch_size,
                                     webenv=webenv, query_key=query_key)
        data = fetch_handle.read()
        fetch_handle.close()
        out_handle.write(data)

print("\nDownload completed")
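For the case raised before the script, where only a fixed list of accessions (around 2500) is wanted, one option is to skip the search step and pass the accessions straight to Entrez.efetch in batches. This is a rough sketch rather than a tested replacement for the script: the function names `batched` and `fetch_accessions` and the batch size of 50 are my own choices, and it assumes the accessions are valid identifiers in NCBI's nucleotide database.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items` holding at most `batch_size` entries."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def fetch_accessions(accessions, out_path, email, batch_size=50):
    """Download FASTA records for the given accessions into one file.

    Hypothetical helper: batch_size=50 is an arbitrary choice, not an NCBI limit.
    """
    # Imported here so that `batched` can be used without Biopython installed.
    from Bio import Entrez

    Entrez.email = email  # required by NCBI
    with open(out_path, "w") as out_handle:
        for batch in batched(accessions, batch_size):
            # efetch accepts a comma-separated string of identifiers
            handle = Entrez.efetch(db="nucleotide", id=",".join(batch),
                                   rettype="fasta", retmode="text")
            out_handle.write(handle.read())
            handle.close()
```

A call would look like `fetch_accessions(my_accessions, "ALL_SEQ.fasta", "your_email@isp.com")`, where `my_accessions` is a list of accession strings read from your own input file.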
