Filter for reads from taxa of interest using Kraken2/Centrifuge classification results

Detailed description of the filter-classified-reads Python project


filter_classified_reads


Filter for reads from taxa of interest using Kraken2/Centrifuge classification results.

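Features

  • Filter for reads classified to taxa of interest by Kraken2 and Centrifuge (by default, filter for viral reads (taxid=10239))
  • Output unclassified reads along with reads from taxa of interest; exclude them with --exclude-unclassified
  • screed for fast filtering of reads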
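
The selection rule described above can be sketched with plain Python sets. This is a minimal illustration, not the package's actual code: the function name `select_read_ids` and its parameters are hypothetical, and it assumes that target reads are combined as a union across classifiers while unclassified reads are kept only when every classifier left them unclassified. The log output in the usage section is consistent with that reading (8345 Kraken2 target reads plus 12 found only by Centrifuge gives the reported total of 8357).

```python
def select_read_ids(kraken2_target, centrifuge_target,
                    kraken2_unclassified, centrifuge_unclassified,
                    exclude_unclassified=False):
    """Return the set of read IDs to keep (hypothetical helper)."""
    # Reads assigned to a taxon of interest by either classifier.
    keep = set(kraken2_target) | set(centrifuge_target)
    if not exclude_unclassified:
        # Reads that neither classifier could place.
        keep |= set(kraken2_unclassified) & set(centrifuge_unclassified)
    return keep


# Toy read IDs, invented for illustration.
kept = select_read_ids({"r1", "r2"}, {"r2", "r3"}, {"r4", "r5"}, {"r5", "r6"})
print(sorted(kept))
```

Passing `exclude_unclassified=True` drops the shared unclassified reads (`r5` in the toy data) and keeps only the target reads.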

Usage

With paired-end reads and classification results from Kraken2 and Centrifuge:

filter_classified_reads -i /path/to/reads/R1.fq \
                        -I /path/to/reads/R2.fq \
                        -o /path/to/reads/R1.filtered.fq \
                        -O /path/to/reads/R2.filtered.fq \
                        -k /path/to/kraken2/results.tsv \
                        -K /path/to/kraken2/kreport.tsv \
                        -c /path/to/centrifuge/results.tsv \
                        -C /path/to/centrifuge/kreport.tsv
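
What the command does to each FASTQ file can be pictured roughly as follows. The package itself uses screed to read sequences; the sketch below deliberately sticks to the standard library so it runs anywhere, so it is a stand-in and not the tool's implementation. The helper names `read_id` and `filter_fastq` are hypothetical, and the code assumes plain 4-line FASTQ records (no wrapped sequence lines).

```python
import io
import itertools


def read_id(header_line):
    """Extract the read ID from a FASTQ header, e.g. '@read7/1 desc' -> 'read7'."""
    name = header_line[1:].split(None, 1)[0]
    # Drop a trailing mate suffix so R1 and R2 share one ID.
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def filter_fastq(in_handle, out_handle, keep_ids):
    """Copy 4-line FASTQ records whose read ID is in keep_ids; return the count."""
    n_written = 0
    while True:
        record = list(itertools.islice(in_handle, 4))
        if len(record) < 4:
            break  # end of input (or a truncated final record)
        if read_id(record[0]) in keep_ids:
            out_handle.writelines(record)
            n_written += 1
    return n_written


# Toy input, invented for illustration.
r1 = io.StringIO("@a/1\nACGT\n+\nIIII\n@b/1\nGGCC\n+\nIIII\n")
out = io.StringIO()
n = filter_fastq(r1, out, {"a"})
print(n, repr(out.getvalue()))
```

Running the same ID set over both the R1 and the R2 file keeps the mates in sync. For gzipped input, `gzip.open(path, "rt")` yields a text handle that can be passed in the same way.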

With the test data in tests/data/:

$ filter_classified_reads -i tests/data/SRR8207674_1.viral_unclassified.seqtk_seed42_n10000.fastq.gz \
                          -I tests/data/SRR8207674_2.viral_unclassified.seqtk_seed42_n10000.fastq.gz \
                          -o r1.fq \
                          -O r2.fq \
                          -k tests/data/SRR8207674-kraken2_results.tsv \
                          -K tests/data/SRR8207674-kraken2_report.tsv \
                          -c tests/data/SRR8207674-centrifuge_results.tsv \
                          -C tests/data/SRR8207674-centrifuge_kreport.tsv

You should see the following log information:

2019-04-16 13:40:34,114 INFO: Parsing centrifuge results into DataFrame [in target_classified_reads.py:49]
2019-04-16 13:40:34,168 INFO: Parsed n=12281 centrifuge result records into DataFrame from "tests/data/SRR8207674-centrifuge_results.tsv" [in target_classified_reads.py:57]
2019-04-16 13:40:34,172 INFO: Parsed n=298 centrifuge Kraken-style report records into DataFrame from "tests/data/SRR8207674-centrifuge_kreport.tsv" [in target_classified_reads.py:60]
2019-04-16 13:40:34,177 INFO: Found 7129 unclassified reads from Centrifuge results [in target_classified_reads.py:65]
2019-04-16 13:40:34,242 INFO: Found 231 unique viral Taxonomy IDs [in target_classified_reads.py:98]
2019-04-16 13:40:34,245 INFO: Found 2181 target reads from centrifuge results [in target_classified_reads.py:101]
2019-04-16 13:40:34,245 INFO: Parsing kraken2 results into DataFrame [in target_classified_reads.py:49]
2019-04-16 13:40:34,289 INFO: Parsed n=20000 kraken2 result records into DataFrame from "tests/data/SRR8207674-kraken2_results.tsv" [in target_classified_reads.py:57]
2019-04-16 13:40:34,293 INFO: Parsed n=139 kraken2 Kraken-style report records into DataFrame from "tests/data/SRR8207674-kraken2_report.tsv" [in target_classified_reads.py:60]
2019-04-16 13:40:34,295 INFO: Found 1737 unclassified reads from Centrifuge results [in target_classified_reads.py:65]
2019-04-16 13:40:34,325 INFO: Found 26 unique viral Taxonomy IDs [in target_classified_reads.py:98]
2019-04-16 13:40:34,331 INFO: Found 8345 target reads from kraken2 results [in target_classified_reads.py:101]
2019-04-16 13:40:34,332 INFO: Found N=1701 common unclassified reads by all classification methods. [in cli.py:110]
2019-04-16 13:40:34,333 INFO: Total viral reads=8357 [in util.py:37]
2019-04-16 13:40:34,333 INFO: Centrifuge found n=12 target reads not found with Kraken2 [in util.py:38]
2019-04-16 13:40:34,333 INFO: Kraken2 found n=6176 target reads not found with Centrifuge [in util.py:40]
2019-04-16 13:40:34,338 INFO: N=1701 reads unclassified by both Centrifuge and Kraken2. [in util.py:62]
2019-04-16 13:40:34,345 INFO: Writing n=9999 filtered reads from "tests/data/SRR8207674_1.viral_unclassified.seqtk_seed42_n10000.fastq.gz" to "r1.fq" [in cli.py:129]
2019-04-16 13:40:34,957 INFO: Writing n=9999 filtered reads from "tests/data/SRR8207674_2.viral_unclassified.seqtk_seed42_n10000.fastq.gz" to "r2.fq" [in cli.py:134]
2019-04-16 13:40:35,459 INFO: Done! [in cli.py:137]
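
The log lines "Found 231 unique viral Taxonomy IDs" and "Found 26 unique viral Taxonomy IDs" suggest that the taxids under Viruses (taxid=10239) are collected from each Kraken-style report. One plausible way to do that, sketched below, relies on the report's name column being indented two spaces per taxonomic level; the tool's real implementation may differ. The function name `clade_taxids` is hypothetical, and the report rows are toy data with invented counts.

```python
def clade_taxids(report_lines, root_taxid=10239):
    """Collect root_taxid and every taxid nested beneath it in a Kraken-style report."""
    taxids = set()
    root_depth = None  # indentation depth of the root while inside its clade
    for line in report_lines:
        fields = line.rstrip("\n").split("\t")
        # The taxid and name are the last two columns in Kraken-style reports.
        if len(fields) < 6 or not fields[-2].isdigit():
            continue
        taxid = int(fields[-2])
        name = fields[-1]
        depth = (len(name) - len(name.lstrip(" "))) // 2
        if taxid == root_taxid:
            root_depth = depth
            taxids.add(taxid)
        elif root_depth is not None:
            if depth > root_depth:
                taxids.add(taxid)  # still inside the clade
            else:
                root_depth = None  # left the clade
    return taxids


# Toy report rows; the counts are invented for illustration.
report = [
    " 50.00\t5\t5\tU\t0\tunclassified",
    " 50.00\t5\t0\tR\t1\troot",
    " 30.00\t3\t0\tD\t10239\t  Viruses",
    " 30.00\t3\t0\tF\t11308\t    Orthomyxoviridae",
    " 30.00\t3\t3\tS\t11320\t      Influenza A virus",
    " 20.00\t2\t2\tD\t2\t  Bacteria",
]
viral = clade_taxids(report)
print(sorted(viral))
```

Reads whose assigned taxid falls in the returned set would then count as target reads for that classifier.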

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.1.0 (2019-04-15)

  • First release on PyPI.
