baseqDrops Python package - PyPI

Processes Drop-seq, 10x (3prime) and inDrop RNA-seq datasets

Detailed description of the baseqDrops Python project

# baseqDrops
A versatile pipeline for processing 10x, inDrop and Drop-seq datasets.

`baseqDrops`

For efficient processing, the computer or server should have at least 30 GB of memory and at least 8 CPU cores.

+ `star`: for aligning reads to the genome;
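+ `samtools`: for sorting the aligned BAM files (version >= 1.6);
+ `whitelistdir`: the barcode whitelist files for inDrop and 10x should be placed under whitelistdir. These files can be downloaded from https://github.com/beiseq/baseqdrops/tree/master/whitelist;
+ `cellranger_ref_<genome>`: the key steps of read alignment and gene tagging are inspired by and borrowed from the open-source cellranger pipeline (https://github.com/10xgenomics/cellranger). The genome index and transcriptome references can be downloaded from https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/latest.
In the configuration file, the entry for the cellranger reference directory is named "cellranger_ref_<genome>".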

The configuration is stored in a file named "config_drops.ini":

[drops]
samtools=/path/to/samtools
star=/path/to/star
whitelistdir=/path/to/whitelist_file_directory
cellranger_ref_hg38=/path/to/reference/refdata-cellranger-grch38-1.2.0/
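
Before starting a long run it can help to confirm that the configuration file has every entry the pipeline expects. The following is a minimal sketch using Python's standard `configparser`; the helper `load_drops_config` is hypothetical (it is not part of baseqDrops), and the genome key is assumed to follow the "cellranger_ref_<genome>" naming described above.

```python
import configparser

# Tool and whitelist entries from the [drops] section (names taken from the README).
REQUIRED_KEYS = ("samtools", "star", "whitelistdir")


def load_drops_config(text, genome):
    """Parse config text and return the [drops] entries needed for `genome`.

    Hypothetical helper for illustration; raises KeyError if anything is missing.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section("drops"):
        raise KeyError("missing [drops] section")
    section = parser["drops"]
    # Assumption: the reference entry is named cellranger_ref_<genome>.
    keys = REQUIRED_KEYS + (f"cellranger_ref_{genome}",)
    missing = [key for key in keys if key not in section]
    if missing:
        raise KeyError("missing config entries: " + ", ".join(missing))
    return {key: section[key] for key in keys}


EXAMPLE = """
[drops]
samtools = /path/to/samtools
star = /path/to/star
whitelistdir = /path/to/whitelist_file_directory
cellranger_ref_hg38 = /path/to/reference/refdata-cellranger-grch38-1.2.0/
"""

config = load_drops_config(EXAMPLE, "hg38")
print(config["cellranger_ref_hg38"])
```

To check a real file, read it with `open(path).read()` and pass the text to the same function.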

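1. `Cell barcode counting`: counts the barcodes present in the dataset. This generates a file named `barcode_count_<sample>.csv`;
2. `Cell barcode correction, aggregation and filtering`: corrects cell barcodes within a 1 bp mismatch, then aggregates the barcodes and filters them by minimum read count (default 5000). This generates the list of valid barcodes, named `barcode_stats_<sample>.csv`;
3. `Split the reads of valid cell barcodes`: splits the raw paired-end reads into 16 single-end files according to the 2 bp prefix of the barcode, for multiprocessing. The barcode split folder contains files named `split.<sample><aa at ac ag…gg>.fq`;
4. `Alignment to the genome using STAR`: several STAR processes (the number is defined by --parallel/-p) run at the same time, and the results are written to a folder named star_align. The BAM files are then sorted by read header;
5. `Reads tagging`: tags each read's alignment position with the corresponding gene name;
6. `Generating the expression table`: generates an expression table quantified by UMI (`result.umis<sample>.txt`) and a table of raw read counts (`result.reads<sample>.txt`);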
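
The barcode correction, filtering and prefix-splitting steps can be illustrated with a small sketch. This is a simplified illustration under stated assumptions, not the baseqDrops implementation: it merges each barcode into an already-seen, more abundant barcode at Hamming distance 1 (ignoring whitelists), keeps barcodes with at least `min_reads` reads, and groups them by their 2 bp prefix. All function names are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def one_mismatch_neighbors(barcode):
    """Yield every barcode that differs from `barcode` at exactly one position."""
    for i, original in enumerate(barcode):
        for base in BASES:
            if base != original:
                yield barcode[:i] + base + barcode[i + 1:]


def correct_and_filter(counts, min_reads=5000):
    """Merge 1-mismatch barcodes into more abundant ones, then filter by read count."""
    merged = Counter()
    # Visit barcodes from most to least abundant so abundant ones become merge targets.
    for barcode, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        target = next(
            (nb for nb in one_mismatch_neighbors(barcode) if nb in merged),
            barcode,
        )
        merged[target] += n
    return {bc: n for bc, n in merged.items() if n >= min_reads}


def split_prefixes():
    """Return the 16 possible 2 bp prefixes (AA, AC, ..., TT)."""
    return ["".join(pair) for pair in product(BASES, repeat=2)]


def group_by_prefix(barcodes):
    """Group barcodes by their 2 bp prefix; barcodes with other prefixes are skipped."""
    groups = {prefix: [] for prefix in split_prefixes()}
    for barcode in barcodes:
        if barcode[:2] in groups:
            groups[barcode[:2]].append(barcode)
    return groups


raw_counts = {"AAAACCCC": 6000, "AAAACCCG": 300, "TTTTGGGG": 4000, "GGGGAAAA": 5200}
valid = correct_and_filter(raw_counts)
groups = group_by_prefix(valid)
print(valid)
```

In this toy input, the 300 reads of `AAAACCCG` are folded into `AAAACCCC`, and `TTTTGGGG` is dropped because it falls below the 5000-read threshold. Grouping by prefix is what allows the 16 split files to be processed independently.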

Run the pipeline

The following parameters should be provided (or run `baseqDrops run-pipe --help` for details):

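+ `--outdir/-d`: output path (default ./; the results will be stored in ./<name>);
+ `--config`: path to the configuration file;
+ `--genome/-g`: genome version [hg38/mm38/hgmm];
+ `--protocol/-p`: [10x/indrop/dropseq];
+ `--minreads`: minimum number of reads required for a barcode;
+ `--name/-n`: sample name; a folder <outdir>/<name> will be created as the main directory;
+ `--parallel`: number of STAR and tagging processes to run at the same time (default 4; a larger value needs more memory);
+ `--fq1/-1`: path to the read 1 file of the paired-end sequencing data;
+ `--fq2/-2`: path to the read 2 file of the paired-end sequencing data;
+ `--top-million-reads`: for large datasets, optionally use only part of the data for a quick look; reads beyond the first N million are skipped;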

If `cellranger_ref_hg38` has been defined in the configuration file, you can run:

baseqDrops run-pipe --config ./config_drops.ini -g hg38 -p 10x --minreads 1000 -n 10x_test -1 10x_1.1.fq.gz -2 10x.2.fq.gz -d ./

To run a single step, provide all parameters as described above plus an additional `--step`, for example:

baseqDrops run-pipe --config ./config.ini -g hg38 -p dropseq --minreads 1000 -n dropseq2 --top-million-reads 20 -1 dropseq_1.1.fq.gz -2 dropseq.2.fq.gz --step count -d ./

+ `cell barcode counting`: --step count
+ `cell barcode correction, aggregation and filtering`: --step stats
+ `split the reads of valid cell barcodes`: --step split
+ `alignment to genome using STAR`: --step star
+ `reads tagging`: --step tagging
+ `generating expression table`: --step table
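
When running step by step, it may be convenient to generate the command line for each step from one place. The sketch below only builds and prints the commands; it does not run them. The executable name `baseqDrops`, the subcommand `run-pipe`, and the helper `build_command` are assumptions for illustration, and the step names are taken from the list and example above.

```python
import shlex

# Step names in pipeline order, as listed in the README.
STEPS = ["count", "stats", "split", "star", "tagging", "table"]


def build_command(step, config, genome, protocol, name, fq1, fq2,
                  outdir="./", minreads=1000):
    """Return the argument list for one pipeline step (hypothetical helper)."""
    return [
        "baseqDrops", "run-pipe",
        "--config", config,
        "-g", genome,
        "-p", protocol,
        "--minreads", str(minreads),
        "-n", name,
        "-1", fq1,
        "-2", fq2,
        "--step", step,
        "-d", outdir,
    ]


commands = [
    build_command(step, "./config.ini", "hg38", "dropseq", "dropseq2",
                  "dropseq_1.1.fq.gz", "dropseq.2.fq.gz")
    for step in STEPS
]

for cmd in commands:
    # shlex.join quotes arguments so the printed line can be pasted into a shell.
    print(shlex.join(cmd))
```

Passing each list to `subprocess.run(cmd, check=True)` would execute the steps in order and stop at the first failure, assuming the tool is installed and on the PATH.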

Please send email to: friedpine@gmail.com

You are welcome to join the QQ group: 979659372

baseqDrops 2.0
