Metagenomics toolkit

Detailed description of the mg-toolkit Python project



The Metagenomics toolkit enables scientists to download all of the sample metadata for a given study or sequence to a single csv file.

Install the Metagenomics toolkit

pip install -U mg-toolkit

Usage

$ mg-toolkit -h
usage: mg-toolkit [-h] [-V] [-d]
                  {original_metadata,sequence_search,bulk_download} ...

Metagenomics toolkit
--------------------

positional arguments:
  {original_metadata,sequence_search,bulk_download}
    original_metadata   Download original metadata.
    sequence_search     Search non-redundant protein database using HMMER
    bulk_download       Download result files in bulks for an entire study.

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         print version information
  -d, --debug           print debugging information

Examples

Download metadata:

$ mg-toolkit original_metadata -a ERP001736
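Once the metadata has been written to a csv file, it can be inspected with the Python standard library. The following is a minimal sketch, not part of mg-toolkit: `load_metadata` is a hypothetical helper, and the file name `ERP001736.csv` is an assumption about where the command writes its output, so adjust the path to the file you actually got.

```python
import csv
from pathlib import Path


def load_metadata(csv_path):
    """Hypothetical helper: read a metadata csv into a list of dicts.

    Each dict maps a column name from the header row to that row's value.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    # Assumed output file name; change it to match the file on disk.
    path = Path("ERP001736.csv")
    if path.exists():
        rows = load_metadata(path)
        print(f"{len(rows)} rows")
        if rows:
            print("columns:", ", ".join(rows[0].keys()))
    else:
        print(f"{path} not found; run mg-toolkit original_metadata first")
```

All values are returned as strings; convert numeric columns explicitly where needed.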

Search the non-redundant protein database using HMMER and fetch metadata:

$ mg-toolkit sequence_search -seq test.fasta -db full evalue -incE 0.02

Databases:
- full - Full length sequences (default)
- all - All sequences
- partial - Partial sequences

How to bulk download result files for an entire study?

$ mg-toolkit bulk_download -h
usage: mg-toolkit bulk_download [-h] -a ACCESSION [-o OUTPUT_PATH]
                                  [-p {1.0,2.0,3.0,4.0,4.1}]
                                  [-g {sequence_data,functional_analysis,taxonomic_analysis,taxonomic_analysis_ssu,taxonomic_analysis_lsu,stats,non_coding_rna}]

optional arguments:
  -h, --help            show this help message and exit
  -a ACCESSION, --accession ACCESSION
                        Provide the study/project accession of your interest,
                        e.g. ERP001736, SRP000319. The study must be publicly
                        available in MGnify.
  -o OUTPUT_PATH, --output_path OUTPUT_PATH
                        Location of the output directory, where the
                        downloadable files are written to. DEFAULT: CWD
  -p {1.0,2.0,3.0,4.0,4.1}, --pipeline {1.0,2.0,3.0,4.0,4.1}
                        Specify the version of the pipeline you are interested
                        in. Lets say your study of interest has been analysed
                        with multiple version, but you are only interested in
                        a particular version then used this option to filter
                        down the results by the version you interested in.
                        DEFAULT: Downloads all versions
  -g {sequence_data,functional_annotations,taxonomic_annotations,taxonomic_annot_ssu,taxonomic_annot_lsu,stats,non_coding_rna}, --result_group {sequence_data,functional_annotations,taxonomic_annotations,taxonomic_annot_ssu,taxonomic_annot_lsu,stats,non_coding_rna}
                        Provide a single result group if needed. Supported
                        result groups are: [sequence_data (all version),
                        functional_annotations (all version),
                        taxonomic_annotations (1.0-3.0), taxonomic_annot_ssu
                        (>=4.0), taxonomic_annot_lsu (>=4.0), stats,
                        non_coding_rna (>=4.0) DEFAULT: Downloads all result
                        groups if not provided. (default: None).

How to download all files for a given study accession?

$ mg-toolkit -d bulk_download -a ERP009703

How to download the results of a specific pipeline version for a study?

$ mg-toolkit -d bulk_download -a ERP009703 -p 4.0

How to download a specific result file group (e.g. functional annotations only) for a given study accession?

$ mg-toolkit -d bulk_download -a ERP009703 -g functional_annotations
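When several studies need to be downloaded from a script, the bulk_download invocations above can be assembled programmatically. The sketch below is a hypothetical helper, not part of mg-toolkit. It uses only the flags shown in the help output (`-d`, `-a`, `-o`, `-p`, `-g`). The result group names are taken from the option description and the example above; the usage synopsis lists different names, so treat them as an assumption and check them against `mg-toolkit bulk_download -h` for your installed version.

```python
import shlex

# Choices copied from the bulk_download help text (assumed to match your version).
PIPELINE_VERSIONS = ("1.0", "2.0", "3.0", "4.0", "4.1")
RESULT_GROUPS = (
    "sequence_data",
    "functional_annotations",
    "taxonomic_annotations",
    "taxonomic_annot_ssu",
    "taxonomic_annot_lsu",
    "stats",
    "non_coding_rna",
)


def build_bulk_download_cmd(accession, output_path=None, pipeline=None,
                            result_group=None, debug=False):
    """Hypothetical helper: return the argv list for one bulk_download call."""
    if pipeline is not None and pipeline not in PIPELINE_VERSIONS:
        raise ValueError(f"unsupported pipeline version: {pipeline!r}")
    if result_group is not None and result_group not in RESULT_GROUPS:
        raise ValueError(f"unsupported result group: {result_group!r}")

    cmd = ["mg-toolkit"]
    if debug:
        cmd.append("-d")  # global flag, placed before the subcommand
    cmd += ["bulk_download", "-a", accession]
    if output_path is not None:
        cmd += ["-o", output_path]
    if pipeline is not None:
        cmd += ["-p", pipeline]
    if result_group is not None:
        cmd += ["-g", result_group]
    return cmd


if __name__ == "__main__":
    cmd = build_bulk_download_cmd(
        "ERP009703",
        pipeline="4.0",
        result_group="functional_annotations",
        debug=True,
    )
    # Print the command line; nothing is executed here.
    print(shlex.join(cmd))
```

To run the command for real, pass the returned list to `subprocess.run(cmd, check=True)`. Building a list rather than a single string avoids shell quoting problems with output paths that contain spaces.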
