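Differential extreme value analysis package

Detailed description of the blacksheep-outliers Python project


BlackSheep

Differential extreme value analysis tool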

Installation

With pip

pip install blacksheep-outliers

With conda

conda install -c bioconda blacksheep-outliers

Requirements

pandas
numpy
matplotlib
seaborn
scipy
scikit-learn

Usage

In Python

import deva

# Read in data
values_file = ''  # insert values file here
annotations_file = ''  # insert annotations file here
values = deva.read_in_values(values_file)
annotations = deva.read_in_values(annotations_file)

# Binarize annotation columns
annotations = deva.binarize_annotations(annotations)

# Run outliers comparative analysis
outliers, qvalues = deva.run_outliers(
    values,
    annotations,
    save_outlier_table=True,
    save_qvalues=True,
    save_comparison_summaries=True,
)

# Pull out results
qvalues_table = qvalues.df
vis_table = outliers.frac_table

# Make heatmaps for significant genes
for col in annotations.columns:
    axs = deva.plot_heatmap(annotations, qvalues_table, col, vis_table, savefig=True)

# Normalize values
phospho = deva.read_in_values('')  # Fill in file here
protein = deva.read_in_values('')  # Fill in file here

Command line interface

Example

deva binarize annotations.tsv --output_prefix annotations_test
deva outliers values.csv annotations_test.binarized.tsv --output_prefix test \
--write_outlier_table --write_comparison_summaries --write_gene_list \
--make_heatmaps

Full help

Make only the outliers table:

usage: deva outliers_table [-h] [--output_prefix OUTPUT_PREFIX] [--iqrs IQRS]
                           [--up_or_down {up,down}] [--ind_sep IND_SEP]
                           [--do_not_aggregate] [--write_frac_table]
                           values

Takes a table of values and converts to a table of outlier counts.

positional arguments:
  values                File path to input values. Columns must be samples,
                        rows must be sites or genes. Only .tsv and .csv
                        accepted.

optional arguments:
  -h, --help            show this help message and exit
  --output_prefix OUTPUT_PREFIX
                        Output prefix for writing files. Default outliers.
  --iqrs IQRS           Number of interquartile ranges (IQRs) above or below
                        the median to consider a value an outlier. Default is
                        1.5 IQRs.
  --up_or_down {up,down}
                        Whether to look for up or down outliers. Choices are
                        up or down. Default up.
  --ind_sep IND_SEP     If site labels have a parent molecule (e.g. a gene
                        name such as ATM) and a site identifier (e.g. S365)
                        this is the delimiter between the two elements.
                        Default is -
  --do_not_aggregate    Use flag if you do not want to sum outliers based on
                        site prefixes.
  --write_frac_table    Use flag if you want to write a table with fraction of
                        values per site, per sample that are outliers. Will
                        not be written by default. Useful for visualization.
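
To make the `--iqrs` and `--up_or_down` options concrete, here is a minimal sketch in plain Python of IQR-based outlier calling for a single row of values. The helper name `call_outliers` and the quartile method are assumptions made for illustration. They are not taken from the deva source, and the tool's exact quartile computation may differ.

```python
import statistics


def call_outliers(row, iqrs=1.5, up_or_down="up"):
    """Flag values more than `iqrs` IQRs above (or below) the row median.

    Hypothetical helper, not part of deva. Missing values (None) are
    never flagged as outliers.
    """
    present = [v for v in row if v is not None]
    median = statistics.median(present)
    # Quartiles of the observed values; the "inclusive" method is an assumption.
    q1, _, q3 = statistics.quantiles(present, n=4, method="inclusive")
    iqr = q3 - q1
    if up_or_down == "up":
        threshold = median + iqrs * iqr
        return [v is not None and v > threshold for v in row]
    threshold = median - iqrs * iqr
    return [v is not None and v < threshold for v in row]


# One gene/site measured in six samples; the last sample is unusually high.
values = [1.0, 1.2, 0.9, 1.1, 1.0, 5.0]
flags = call_outliers(values)
print(flags)
```

Raising `iqrs` moves the threshold further from the median, so fewer values are flagged.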

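Binarize columns in an annotation table. **Warning: do not include non-categorical columns or columns you do not want binarized, or you will end up with a huge, unwieldy table.**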

usage: deva binarize [-h] [--output_prefix OUTPUT_PREFIX] annotations

Takes an annotation table where some columns may have more than 2 possible
values (not including empty/null values) and outputs an annotation table with
only two values per annotation. Propagates null values.

positional arguments:
  annotations           Annotation table with samples as rows and annotation
                        labels as columns.

optional arguments:
  -h, --help            show this help message and exit
  --output_prefix OUTPUT_PREFIX
                        Output prefix for writing files. Default outliers.
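
The following sketch illustrates what binarizing means for one annotation column: a column with three possible values becomes three two-valued columns, and missing values stay missing. The function `binarize_column` and the `name_value` / `not_value` naming scheme are hypothetical and chosen only for illustration. The labels deva actually writes may differ.

```python
def binarize_column(name, labels):
    """Split one annotation column into two-valued columns.

    `labels` maps sample -> label (None for missing). Columns that already
    have two or fewer distinct values are returned unchanged. Output column
    and label names are illustrative, not deva's actual naming.
    """
    distinct = sorted({v for v in labels.values() if v is not None})
    if len(distinct) <= 2:
        return {name: dict(labels)}
    out = {}
    for value in distinct:
        out[f"{name}_{value}"] = {
            # Propagate nulls; otherwise mark each sample as value / not_value.
            sample: None if label is None
            else (value if label == value else f"not_{value}")
            for sample, label in labels.items()
        }
    return out


subtype = {"s1": "basal", "s2": "luminal", "s3": "her2", "s4": None}
binarized = binarize_column("subtype", subtype)
for col, col_labels in binarized.items():
    print(col, col_labels)
```

This also shows why the warning above matters: a column with N distinct values expands to N columns, so a numeric or free-text column would produce one output column per distinct entry.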

Compare all groups described in the columns of an annotation table using outlier counts:

usage: deva compare_groups [-h] [--output_prefix OUTPUT_PREFIX]
                           [--frac_filter FRAC_FILTER]
                           [--write_comparison_summaries] [--iqrs IQRS]
                           [--up_or_down {up,down}] [--write_gene_list]
                           [--make_heatmaps] [--fdr FDR]
                           [--red_or_blue {red,blue}]
                           [--annotation_colors ANNOTATION_COLORS]
                           outliers_table annotations

Takes an annotation table and outlier count table (output of outliers_table)
and outputs qvalues from a statistical test that looks for enrichment of
outlier values in each group in the annotation table. For each value in each
comparison, the qvalue table will have 1 column, if there are any genes in
that comparison.

positional arguments:
  outliers_table        Table of outlier counts (output of outliers_table).
                        Must be .tsv or .csv file, with outlier and non-
                        outlier counts as columns, and genes/sites as rows.
  annotations           Table of annotations. Must be .csv or .tsv. Samples as
                        rows and comparisons as columns. Comparisons must have
                        only 2 unique values (not including missing values). If
                        there are more options than that, you can use binarize
                        to prepare the table.

optional arguments:
  -h, --help            show this help message and exit
  --output_prefix OUTPUT_PREFIX
                        Output prefix for writing files. Default outliers.
  --frac_filter FRAC_FILTER
                        The minimum fraction of samples per group that must
                        have an outlier in a gene to consider that gene in the
                        analysis. This is used to prevent a high number of
                        outlier values in 1 sample from driving a low qvalue.
                        Default 0.3
  --write_comparison_summaries
                        Use flag to write a separate file for each column in
                        the annotations table, with outlier counts in each
                        group, p-values and q-values in each group.
  --iqrs IQRS           Number of IQRs used to define outliers in the input
                        count table. Optional.
  --up_or_down {up,down}
                        Whether input outlier table represents up or down
                        outliers. Needed for output file labels. Default up
  --write_gene_list     Use flag to write a list of significantly enriched
                        genes for each value in each comparison. If used, need
                        an fdr threshold as well.
  --make_heatmaps       Use flag to draw a heatmap of significantly enriched
                        genes for each value in each comparison. If used, need
                        an fdr threshold as well.
  --fdr FDR             FDR cut off to use for significantly enriched gene
                        lists and heatmaps. Default 0.05
  --red_or_blue {red,blue}
                        If --make_heatmaps is called, color of values to draw
                        on heatmap. Default red.
  --annotation_colors ANNOTATION_COLORS
                        File with color map to use for annotation header if
                        --make_heatmaps is used. Must have a 'value color'
                        format for each value in annotations. Any value not
                        represented will be assigned a new color.
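
The help text above refers to "a statistical test that looks for enrichment" and to q-values without naming the methods. As a sketch only, the example below assumes a one-sided Fisher's exact test on a 2x2 table of outlier and non-outlier counts (group of interest versus the rest), followed by Benjamini-Hochberg correction. Both choices, the function names, and the gene counts are assumptions for illustration, not a description of deva's implementation.

```python
from math import comb


def fisher_greater(out_in, non_in, out_rest, non_rest):
    """One-sided Fisher's exact p-value: are outliers enriched in the group?"""
    row1 = out_in + non_in        # all values in the group of interest
    col1 = out_in + out_rest      # all outlier values
    total = row1 + out_rest + non_rest
    upper = min(row1, col1)
    # Sum hypergeometric probabilities of tables at least as extreme.
    numerator = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(out_in, upper + 1)
    )
    return numerator / comb(total, row1)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted q-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Made-up counts per gene:
# (outliers in group, non-outliers in group, outliers in rest, non-outliers in rest)
counts = {
    "GENE_A": (8, 2, 1, 9),
    "GENE_B": (3, 7, 3, 7),
    "GENE_C": (5, 5, 2, 8),
}
pvals = [fisher_greater(*c) for c in counts.values()]
qvals = benjamini_hochberg(pvals)
significant = [gene for gene, q in zip(counts, qvals) if q < 0.05]
print(significant)
```

In this sketch the `--fdr` option corresponds to the `0.05` cutoff applied to the q-values.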

Make heatmaps visualizing the enriched genes for each group in the annotation table:

usage: deva visualize [-h] [--output_prefix OUTPUT_PREFIX]
                      [--annotations_to_show ANNOTATIONS_TO_SHOW [ANNOTATIONS_TO_SHOW ...]]
                      [--fdr FDR] [--red_or_blue {red,blue}]
                      [--annotation_colors ANNOTATION_COLORS]
                      [--write_gene_list]
                      comparison_qvalues annotations visualization_table
                      comparison_of_interest

Used to make custom heatmaps from significant genes.

positional arguments:
  comparison_qvalues    Table of qvalues, output from compare_groups. Must be
                        .csv or .tsv. Has genes/sites as rows and comparison
                        values as columns.
  annotations           Table of annotations used to generate qvalues.
  visualization_table   Values to visualize in heatmap. Samples as columns and
                        genes/sites as rows. Using outlier fraction table is
                        recommended, but original values can also be used if
                        no aggregation was used.
  comparison_of_interest
                        Name of column in qvalues table from which to
                        visualize significant genes.

optional arguments:
  -h, --help            show this help message and exit
  --output_prefix OUTPUT_PREFIX
                        Output prefix for writing files. Default outliers.
  --annotations_to_show ANNOTATIONS_TO_SHOW [ANNOTATIONS_TO_SHOW ...]
                        Names of columns from the annotation table to show in
                        the header of the heatmap. Default is all columns.
  --fdr FDR             FDR threshold to use to select genes to visualize.
                        Default 0.05
  --red_or_blue {red,blue}
                        Color of values to draw on heatmap. Default red.
  --annotation_colors ANNOTATION_COLORS
                        File with color map to use for annotation header. Must
                        have a line with 'value color' format for each value
                        in annotations. Any value not represented will be
                        assigned a new color.
  --write_gene_list     Use flag to write a list of significantly enriched
                        genes for each value in each comparison.
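
The `--annotation_colors` file is described above as having one 'value color' pair per line. A minimal sketch of reading such a file follows. It assumes the two fields are separated by whitespace and that the color is the last token on the line. The helper `read_color_map` and the example colors are hypothetical, and deva's own parser may apply different rules.

```python
import io


def read_color_map(handle):
    """Parse lines of the form 'value color' into a dict.

    Assumes the color is the last whitespace-separated token on each line,
    so values containing spaces still parse. Blank lines are skipped.
    """
    colors = {}
    for line in handle:
        line = line.strip()
        if not line:
            continue
        value, color = line.rsplit(None, 1)
        colors[value] = color
    return colors


# An in-memory stand-in for a color-map file.
example = io.StringIO("mutant #d62728\nwild type #1f77b4\n\n")
color_map = read_color_map(example)
print(color_map)
```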

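Run the whole pipeline: call outliers, perform comparisons for all groups in the annotation table, and optionally make heatmaps for each group.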

usage: deva outliers [-h] [--output_prefix OUTPUT_PREFIX] [--iqrs IQRS]
                     [--up_or_down {up,down}] [--do_not_aggregate]
                     [--write_outlier_table] [--write_frac_table]
                     [--ind_sep IND_SEP] [--frac_filter FRAC_FILTER]
                     [--write_comparison_summaries] [--fdr FDR]
                     [--write_gene_list] [--make_heatmaps]
                     [--red_or_blue {red,blue}]
                     [--annotation_colors ANNOTATION_COLORS]
                     values annotations

Runs whole outliers pipeline. Has options to output every possible output.

positional arguments:
  values                File path to input values. Samples are columns and
                        genes/sites are rows. Only .tsv and .csv accepted.
  annotations           File path to annotation values. Rows are sample names,
                        header is different annotations. e.g. mutation status.

optional arguments:
  -h, --help            show this help message and exit
  --output_prefix OUTPUT_PREFIX
                        Output prefix for writing files. Default outliers.
  --iqrs IQRS           Number of inter-quartile ranges (IQRs) above or below
                        the median to consider a value an outlier. Default is
                        1.5.
  --up_or_down {up,down}
                        Whether to look for up or down outliers. Choices are
                        up or down. Default up.
  --do_not_aggregate    Use flag if you do not want to sum outliers based on
                        site prefixes.
  --write_outlier_table
                        Use flag to write a table of outlier counts.
  --write_frac_table    Use flag if you want to write a table with fraction of
                        values per site per sample that are outliers. Useful
                        for custom visualization.
  --ind_sep IND_SEP     If site labels have a parent molecule (e.g. a gene
                        name such as ATM) and a site identifier (e.g. S365)
                        this is the delimiter between the two elements.
                        Default is -
  --frac_filter FRAC_FILTER
                        The minimum fraction of samples per group that must
                        have an outlier in a gene to consider that gene in the
                        analysis. This is used to prevent a high number of
                        outlier values in 1 sample from driving a low qvalue.
                        Default 0.3
  --write_comparison_summaries
                        Use flag to write a separate file for each column in
                        the annotations table, with outlier counts in each
                        group, p-values and q-values in each group.
  --fdr FDR             FDR threshold to use to select genes to visualize.
                        Default 0.05
  --write_gene_list     Use flag to write a list of significantly enriched
                        genes for each value in each comparison.
  --make_heatmaps       Use flag to draw a heatmap of significantly enriched
                        genes for each value in each comparison. If used, need
                        an fdr threshold as well.
  --red_or_blue {red,blue}
                        Color of values to draw on heatmap. Default red.
  --annotation_colors ANNOTATION_COLORS
                        File with color map to use for annotation header. Must
                        have a line with 'value color' format for each value
                        in annotations. Any value not represented will be
                        assigned a new color.
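
The `--ind_sep` and `--do_not_aggregate` options control whether site-level outliers are summed per parent molecule. The sketch below shows that idea on a simplified table holding only outlier counts per sample. The function `aggregate_by_parent` is hypothetical, and splitting at the first occurrence of the separator is an assumption. deva's handling of labels that contain the separator more than once may differ.

```python
def aggregate_by_parent(site_counts, ind_sep="-"):
    """Sum per-sample outlier counts over sites sharing a parent prefix.

    `site_counts` maps a site label such as 'ATM-S365' to a list of
    per-sample outlier counts. Illustrative only, not deva's code.
    """
    totals = {}
    for site, counts in site_counts.items():
        # Text before the first separator is treated as the parent (assumption).
        parent = site.split(ind_sep, 1)[0]
        if parent in totals:
            totals[parent] = [a + b for a, b in zip(totals[parent], counts)]
        else:
            totals[parent] = list(counts)
    return totals


# Three sites across three samples; 1 marks an outlier value.
site_outliers = {
    "ATM-S365": [1, 0, 0],
    "ATM-S1981": [1, 1, 0],
    "TP53-S15": [0, 0, 1],
}
gene_outliers = aggregate_by_parent(site_outliers)
print(gene_outliers)
```

With aggregation disabled, each site would instead be kept as its own row.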

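Used to find differences in values that are not explained by a different level of data. For example, this can be used to find changes caused by differential phosphorylation (phospho as the target values) that are not due to changes in protein abundance (protein as the normalizer values).
Warning: row IDs must match between the two tables.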

usage: deva normalize [-h] [--ind_sep IND_SEP] [--output_prefix OUTPUT_PREFIX]
                      target_values normalizer_values

Takes a target table and a normalizer table, and returns a normalized target
table. Builds a regularized linear model for each line in the target table
using the matching row ID in the normalizer table, and finds the residuals of
that model for each value. For example, this could be used to normalize
phospho-peptide data by protein abundance data; resulting values will reflect
only abundance differences due to phosphorylation changes, not peptide
abundances. Another use could be normalizing RNA by CNA.

positional arguments:
  target_values         Table of values to be normalized. Sites/genes as rows,
                        samples as columns. Row identifiers must be unique.
  normalizer_values     Table of values to use for normalization. Sites/genes
                        as rows, samples as columns. Row identifiers must be
                        unique, and must match the pre-ind_sep part of the
                        target values identifiers.

optional arguments:
  -h, --help            show this help message and exit
  --ind_sep IND_SEP     Separator used in index if target is site specific.
                        Row IDs before ind_sep in the target must match the
                        row IDs in normalizer_values. If row IDs already
                        match, leave blank.
  --output_prefix OUTPUT_PREFIX
                        Prefix for output file. Suffix will be
                        '.normalized.tsv'
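
As a rough illustration of the normalization described above, the sketch below fits a one-predictor ridge regression of each target row on its matching normalizer row and keeps the residuals. The help text only says "regularized linear model", so the ridge penalty, the `alpha` value, the function names, and the choice to skip target rows without a match are all assumptions rather than deva's actual behavior.

```python
def ridge_residuals(target, normalizer, alpha=1.0):
    """Residuals of a one-predictor ridge fit of target on normalizer.

    Samples where either value is missing (None) yield None.
    `alpha` is an illustrative penalty; deva's model settings may differ.
    """
    pairs = [(x, y) for x, y in zip(normalizer, target)
             if x is not None and y is not None]
    mean_x = sum(x for x, _ in pairs) / len(pairs)
    mean_y = sum(y for _, y in pairs) / len(pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sxy / (sxx + alpha)   # the penalty shrinks the slope toward 0
    intercept = mean_y - slope * mean_x
    return [None if x is None or y is None else y - (intercept + slope * x)
            for x, y in zip(normalizer, target)]


def normalize(target_rows, normalizer_rows, ind_sep="-"):
    """Match each target row to a normalizer row by the ID before ind_sep."""
    result = {}
    for row_id, values in target_rows.items():
        parent = row_id.split(ind_sep, 1)[0]
        if parent in normalizer_rows:   # unmatched rows are skipped (assumption)
            result[row_id] = ridge_residuals(values, normalizer_rows[parent])
    return result


# Made-up phospho-site and protein values across four samples.
phospho = {"ATM-S365": [2.0, 4.1, 6.0, 9.5], "TP53-S15": [1.0, 1.0, 1.0, 1.0]}
protein = {"ATM": [1.0, 2.0, 3.0, 4.0]}
normalized = normalize(phospho, protein)
print(normalized)
```

A large positive residual marks a sample whose target value is higher than the normalizer value alone would predict.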

For more detailed vignettes, see our [supplementary notebooks](https://github.com/ruggleslab/blacksheep_supp)
