export2graphlan: a conversion tool that generates annotation and tree files for GraPhlAn


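export2graphlan is a conversion software tool for producing both the annotation and the tree file for GraPhlAn. In particular, the annotation file tries to highlight specific sub-trees, deriving automatically from the input file which nodes are important. The two output files of export2graphlan should then be used to run graphlan_annotate.py, which attaches the derived annotations to the tree, and finally, by executing graphlan.py, the user can obtain the output image.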
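The three-step workflow described above can be sketched in Python as follows. This is only an illustration and not part of export2graphlan: the helper names build_pipeline and run_pipeline and all file names are hypothetical. The -i, -o, -t and -a options are the ones listed in the usage text of this README; the --annot option of graphlan_annotate.py and the positional arguments of graphlan.py are assumptions based on GraPhlAn's own help and should be checked against the installed version.

```python
import subprocess


def build_pipeline(lefse_input, lefse_output,
                   tree="tree.txt", annot="annot.txt",
                   annotated_tree="outtree.txt", image="outimg.png"):
    """Return the three commands of the workflow as argument lists."""
    return [
        # Step 1: derive the tree and the annotation file from the LEfSe data.
        ["export2graphlan.py", "-i", lefse_input, "-o", lefse_output,
         "-t", tree, "-a", annot],
        # Step 2: attach the annotations to the tree
        # (the --annot option is an assumption, see the text above).
        ["graphlan_annotate.py", "--annot", annot, tree, annotated_tree],
        # Step 3: render the annotated tree to an image.
        ["graphlan.py", annotated_tree, image],
    ]


def run_pipeline(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only print the commands; call run_pipeline(...) to execute them.
    for cmd in build_pipeline("lefse_input.txt", "lefse_output.txt"):
        print(" ".join(cmd))
```

Keeping the command construction separate from the execution makes it possible to inspect the commands before anything is run.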

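Prerequisites

export2graphlan requires the following additional libraries:

  • pandas ver. 0.13.1 (pandas)
  • BIOM ver. 2.0.1 (biom-format, only if you have input files in BIOM format)
  • SciPy (scipy, required by hclust2)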
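A quick way to see which of these libraries are installed is sketched below. It is a minimal example, assuming Python 3.8 or later (for importlib.metadata) and assuming that the distribution names match the package names given in the list above; the function name installed_versions is hypothetical.

```python
from importlib import metadata

# Distribution names as given in the prerequisites list.
REQUIRED = ("pandas", "biom-format", "scipy")


def installed_versions(names=REQUIRED):
    """Map each distribution name to its installed version, or None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(name, version or "not installed")
```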

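Installation

export2graphlan is available on GitHub (export2graphlan repository) and can be obtained in either of two ways:

  1. Bioconda
$ conda install export2graphlan
  2. Repository
$ git clone <export2graphlan repository URL>

This downloads the export2graphlan repository locally into the export2graphlan subfolder. You then have to add this subfolder to the system path, so that you can use export2graphlan from anywhere in your system: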

$ export PATH=`pwd`/export2graphlan/:$PATH

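Adding the above line to your bash configuration file makes the path addition permanent. For Windows or MacOS systems a similar procedure should be followed.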
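Whether the path addition worked can be checked with a few lines of Python. This is a minimal sketch using the standard shutil.which function; the helper name find_script is hypothetical, and it assumes the script file is named export2graphlan.py as in the usage text of this README.

```python
import shutil


def find_script(name="export2graphlan.py", search_path=None):
    """Return the full path of `name` if it is an executable on the path.

    `search_path` is a directory list in PATH syntax; when it is None,
    the current PATH environment variable is used. Returns None if the
    script is not found.
    """
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    location = find_script()
    print(location or "export2graphlan.py is not on the PATH")
```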

Usage

usage: export2graphlan.py [-h] [-i LEFSE_INPUT] [-o LEFSE_OUTPUT] -t TREE -a
                          ANNOTATION [--annotations ANNOTATIONS]
                          [--external_annotations EXTERNAL_ANNOTATIONS]
                          [--background_levels BACKGROUND_LEVELS]
                          [--background_clades BACKGROUND_CLADES]
                          [--background_colors BACKGROUND_COLORS]
                          [--title TITLE] [--title_font_size TITLE_FONT_SIZE]
                          [--def_clade_size DEF_CLADE_SIZE]
                          [--min_clade_size MIN_CLADE_SIZE]
                          [--max_clade_size MAX_CLADE_SIZE]
                          [--def_font_size DEF_FONT_SIZE]
                          [--min_font_size MIN_FONT_SIZE]
                          [--max_font_size MAX_FONT_SIZE]
                          [--annotation_legend_font_size ANNOTATION_LEGEND_FONT_SIZE]
                          [--abundance_threshold ABUNDANCE_THRESHOLD]
                          [--most_abundant MOST_ABUNDANT]
                          [--least_biomarkers LEAST_BIOMARKERS]
                          [--discard_otus] [--internal_levels]
                          [--biomarkers2colors BIOMARKERS2COLORS] [--sep SEP]
                          [--out_table OUT_TABLE] [--fname_row FNAME_ROW]
                          [--sname_row SNAME_ROW]
                          [--metadata_rows METADATA_ROWS]
                          [--skip_rows SKIP_ROWS] [--sperc SPERC]
                          [--fperc FPERC] [--stop STOP] [--ftop FTOP]
                          [--def_na DEF_NA]

export2graphlan.py (ver. 0.2.1 of 27 October 2018). Convert MetaPhlAn, LEfSe,
and/or HUMAnN output to GraPhlAn input format. Authors: Francesco Asnicar
(f.asnicar@unitn.it)

optional arguments:
  -h, --help            show this help message and exit
  --annotations ANNOTATIONS
                        List which levels should be annotated in the tree. Use
                        a comma separate values form, e.g.,
                        --annotations 1,2,3. Default is None
  --external_annotations EXTERNAL_ANNOTATIONS
                        List which levels should use the external legend for
                        the annotation. Use a comma separate values form,
                        e.g., --external_annotations 1,2,3. Default is None
  --background_levels BACKGROUND_LEVELS
                        List which levels should be highlighted with a shaded
                        background. Use a comma separate values form, e.g.,
                        --background_levels 1,2,3. Default is None
  --background_clades BACKGROUND_CLADES
                        Specify the clades that should be highlighted with a
                        shaded background. Use a comma separate values form
                        and surround the string with " if there are spaces.
                        Example: --background_clades "Bacteria.Actinobacteria,
                        Bacteria.Bacteroidetes.Bacteroidia,
                        Bacteria.Firmicutes.Clostridia.Clostridiales". Default
                        is None
  --background_colors BACKGROUND_COLORS
                        Set the color to use for the shaded background. Colors
                        can be either in RGB or HSV (using a semi-colon to
                        separate values, surrounded with ()) format. Use a
                        comma separate values form and surround the string
                        with " if it contains spaces. Example:
                        --background_colors "#29cc36, (150; 100; 100), (280;
                        80; 88)". To use a fixed set of colors associated to a
                        fixed set of clades, you can specify a mapping file in
                        a tab-separated format, where the first column is the
                        clade (using the same format as for the
                        "--background_clades" param) and the second column is
                        the color associated. Default is None
  --title TITLE         If specified set the title of the GraPhlAn plot.
                        Surround the string with " if it contains spaces,
                        e.g., --title "Title example"
  --title_font_size TITLE_FONT_SIZE
                        Set the title font size. Default is 15
  --def_clade_size DEF_CLADE_SIZE
                        Set a default size for clades that are not found as
                        biomarkers by LEfSe. Default is 10
  --min_clade_size MIN_CLADE_SIZE
                        Set the minimum value of clades that are biomarkers.
                        Default is 20
  --max_clade_size MAX_CLADE_SIZE
                        Set the maximum value of clades that are biomarkers.
                        Default is 200
  --def_font_size DEF_FONT_SIZE
                        Set a default font size. Default is 10
  --min_font_size MIN_FONT_SIZE
                        Set the minimum font size to use. Default is 8
  --max_font_size MAX_FONT_SIZE
                        Set the maximum font size. Default is 12
  --annotation_legend_font_size ANNOTATION_LEGEND_FONT_SIZE
                        Set the font size for the annotation legend. Default
                        is 10
  --abundance_threshold ABUNDANCE_THRESHOLD
                        Set the minimum abundance value for a clade to be
                        annotated. Default is 20.0
  --most_abundant MOST_ABUNDANT
                        When only lefse_input is provided, you can specify how
                        many clades to highlight. Since the biomarkers are
                        missing, they will be chosen from the most abundant.
                        Default is 10
  --least_biomarkers LEAST_BIOMARKERS
                        When only lefse_input is provided, you can specify the
                        minimum number of biomarkers to extract. The taxonomy
                        is parsed, and the level is chosen in order to have
                        at least the specified number of biomarkers. Default
                        is 3
  --discard_otus        If specified the OTU ids will be discarded from the
                        taxonomy. Default is True, i.e. keep OTUs IDs in
                        taxonomy
  --internal_levels     If specified sum-up from leaf to root the abundances
                        values. Default is False, i.e. do not sum-up
                        abundances on the internal nodes
  --biomarkers2colors BIOMARKERS2COLORS
                        Mapping file that associates biomarkers to a specific
                        color... I'll define later the specific format of this
                        file!

input parameters:
  You need to provide at least one of the two arguments

  -i LEFSE_INPUT, --lefse_input LEFSE_INPUT
                        LEfSe input data. A file that can be given to LEfSe
                        for biomarkers analysis. It can be the result of a
                        MetaPhlAn or HUMAnN analysis
  -o LEFSE_OUTPUT, --lefse_output LEFSE_OUTPUT
                        LEfSe output result data. The result of LEfSe analysis
                        performed on the lefse_input file

output parameters:
  -t TREE, --tree TREE  Output filename where save the input tree for GraPhlAn
  -a ANNOTATION, --annotation ANNOTATION
                        Output filename where save GraPhlAn annotation

Input data matrix parameters:
  --sep SEP
  --out_table OUT_TABLE
                        Write processed data matrix to file
  --fname_row FNAME_ROW
                        row number containing the names of the features
                        [default 0, specify -1 if no names are present in the
                        matrix]
  --sname_row SNAME_ROW
                        column number containing the names of the samples
                        [default 0, specify -1 if no names are present in the
                        matrix]
  --metadata_rows METADATA_ROWS
                        Row numbers to use as metadata [default None, meaning
                        no metadata]
  --skip_rows SKIP_ROWS
                        Row numbers to skip (0-indexed, comma separated) from
                        the input file [default None, meaning no rows skipped]
  --sperc SPERC         Percentile of sample value distribution for sample
                        selection
  --fperc FPERC         Percentile of feature value distribution for sample
                        selection
  --stop STOP           Number of top samples to select (ordering based on
                        percentile specified by --sperc)
  --ftop FTOP           Number of top features to select (ordering based on
                        percentile specified by --fperc)
  --def_na DEF_NA       Set the default value for missing values [default None
                        which means no replacement]

Note: the last group of input parameters (Input data matrix parameters) refers to the DataMatrix class contained in the hclust2 repository.
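The help text for --background_colors mentions a tab-separated mapping file whose first column is the clade and whose second column is the color. A minimal sketch of how such a file could be written is shown below. The function name write_color_mapping is hypothetical, and the clade names and colors are simply reused from the examples in the help text. Whether the mapping file accepts the HSV notation as well as hex colors is an assumption that the help text does not confirm.

```python
# Clade names and colors are taken from the examples in the help text;
# pairing them up like this is purely illustrative.
CLADE_COLORS = {
    "Bacteria.Actinobacteria": "#29cc36",
    "Bacteria.Bacteroidetes.Bacteroidia": "(150; 100; 100)",
    "Bacteria.Firmicutes.Clostridia.Clostridiales": "(280; 80; 88)",
}


def write_color_mapping(path, clade_colors):
    """Write one 'clade<TAB>color' line per entry of `clade_colors`."""
    with open(path, "w") as handle:
        for clade, color in clade_colors.items():
            handle.write(f"{clade}\t{color}\n")
```

Calling write_color_mapping("background_colors.txt", CLADE_COLORS) would produce a three-line file that can then be tried as the value of --background_colors.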

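Examples

The examples folder contains several sub-folders, including hmp_aerobiosis and hmp_metahit_metabolic. Each example should work just by typing the following command in a terminal window (provided that you are inside one of the example folders):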

$ ./PIPELINE.sh

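If everything goes well, you should find six new files in the same folder as the example: annot.txt, outimg.png, outimg_annot.png, outimg_legend.png, outtree.txt, and tree.txt, where:

  • annot.txt: contains the annotation that will be used by GraPhlAn, produced by the export2graphlan.py script
  • outimg.png: is the circular tree produced by GraPhlAn
  • outimg_annot.png: contains the annotation legend of the circular tree
  • outimg_legend.png: contains the legend of the biomarkers highlighted in the circular tree
  • outtree.txt: is the annotated tree produced by graphlan_annotate.py
  • tree.txt: is the tree produced by the export2graphlan.py script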
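After running an example, the presence of the six files listed above can be verified with a short script. This is a minimal sketch; the function name missing_outputs is hypothetical and the file names are exactly those given in the list.

```python
from pathlib import Path

# The six files an example is expected to produce, as listed above.
EXPECTED_OUTPUTS = (
    "annot.txt", "outimg.png", "outimg_annot.png",
    "outimg_legend.png", "outtree.txt", "tree.txt",
)


def missing_outputs(folder="."):
    """Return the expected output files that are absent from `folder`."""
    base = Path(folder)
    return [name for name in EXPECTED_OUTPUTS if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_outputs()
    print("all outputs present" if not missing else f"missing: {missing}")
```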

Support

If you find problems in using export2graphlan, please report them in The bioBakery help forum.
