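ChIP-R is a method for assessing the reproducibility of replicated ChIP-seq type experiments. It combines the rank product method, a novel thresholding method, and peak fragmentation to return the most reproducible peaks.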
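ChIP-R ("chipper")
ChIP-R uses an adaptation of the rank product statistic to assess the reproducibility of ChIP-seq peaks. It incorporates information from multiple ChIP-seq replicates and "fragments" peak locations so that the information present across replicates can be combined more effectively.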
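To make the rank product idea concrete, here is a minimal sketch, not ChIP-R's actual implementation. It assumes each peak already has one score per replicate (higher is stronger), and the function name `rank_product` and the toy scores are hypothetical. It ranks peaks within each replicate and takes the geometric mean of those ranks; peak fragmentation and the thresholding step are deliberately left out.

```python
import numpy as np
from scipy.stats import rankdata


def rank_product(scores):
    """Geometric mean of per-replicate ranks for each peak.

    scores: array-like of shape (n_peaks, n_replicates), higher = stronger.
    Illustrative only; ChIP-R's own statistic may differ in detail.
    """
    scores = np.asarray(scores, dtype=float)
    # Rank within each replicate (column); rank 1 is the strongest peak.
    ranks = np.apply_along_axis(
        lambda col: rankdata(-col, method="average"), 0, scores
    )
    # Geometric mean of the ranks, computed in log space.
    return np.exp(np.log(ranks).mean(axis=1))


# Hypothetical toy data: 3 peaks scored in 3 replicates.
toy_scores = [
    [50.0, 45.0, 60.0],  # strongest in every replicate
    [10.0, 40.0, 5.0],
    [30.0, 2.0, 20.0],
]
rp = rank_product(toy_scores)
for i, value in enumerate(rp):
    print(f"peak {i}: rank product = {value:.3f}")
```

A small rank product means a peak ranks consistently high across replicates, which is the intuition behind using it as a reproducibility measure.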
Installation
- Python 3.x with the following packages:
- numpy
- scipy
- 侏儒
To install ChIP-R:
pip install ChIP-R
Or, to install from source:
git clone https://github.com/rhysnewell/ChIP-R.git
cd ChIP-R
python3 setup.py install
Usage
On the command line, type 'chipr -h' for detailed usage:
$ chipr -h
usage: chipr [-h] -i INPUT [INPUT ...] [-o OUTPUT] [-m MINENTRIES]
             [--rankmethod RANKMETHOD] [--duphandling DUPHANDLING]
             [--seed RANDOM_SEED] [-a ALPHA]

Combine multiple ChIP-seq files and return a union of all peak locations and a
set confident, reproducible peaks as determined by rank product analysis

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT [INPUT ...], --input INPUT [INPUT ...]
                        ChIP-seq input files. These files must be in either
                        narrowPeak, broadPeak, or regionPeak format. Multiple
                        inputs are separeted by a single space
  -o OUTPUT, --output OUTPUT
                        ChIP-seq output filename prefix
  -B, --bigbed          Specify if input files are in BigBed format
  -m MINENTRIES, --minentries MINENTRIES
                        The minimum peaks between replicates required to form
                        an intersection of the peaks Default: 1
  --rankmethod RANKMETHOD
                        The ranking method used to rank peaks within
                        replicates. Options: 'signalvalue', 'pvalue',
                        'qvalue'. Default: pvalue
  --duphandling DUPHANDLING
                        Specifies how to handle entries that are ranked
                        equally within a replicate Can either take the
                        'average' ranks or a 'random' rearrangement of the
                        ordinal ranks Options: 'average', 'random' Default:
                        'average'
  --seed RANDOM_SEED    Specify a seed to be used in conjunction with the
                        'random' option for -duphandling Must be between 0 and
                        1 Default: 0.5
  -a ALPHA, --alpha ALPHA
                        Alpha specifies the user cut-off value for set of
                        reproducible peaks The analysis will still produce
                        results including peaks within the threshold
                        calculatedusing the binomial method Default: 0.05
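The two --duphandling options can be illustrated with a short sketch. This is an assumption about what 'average' and 'random' mean in general ranking terms, not a copy of ChIP-R's code. The helper `rank_with_ties` is hypothetical, and it takes an integer seed for NumPy's generator, whereas the chipr --seed option expects a value between 0 and 1.

```python
import numpy as np
from scipy.stats import rankdata


def rank_with_ties(values, duphandling="average", seed=0):
    """Rank values in descending order (rank 1 = largest), resolving ties.

    'average': tied entries share the mean of the ranks they span.
    'random' : tied entries get distinct ordinal ranks in random order.
    Hypothetical helper for illustration only.
    """
    values = np.asarray(values, dtype=float)
    if duphandling == "average":
        return rankdata(-values, method="average")
    if duphandling == "random":
        rng = np.random.default_rng(seed)
        # Shuffle first, then stable-sort: ties keep their shuffled order.
        perm = rng.permutation(len(values))
        order = perm[np.argsort(-values[perm], kind="stable")]
        ranks = np.empty(len(values))
        ranks[order] = np.arange(1, len(values) + 1)
        return ranks
    raise ValueError("duphandling must be 'average' or 'random'")


toy_values = [5.0, 3.0, 5.0, 1.0]  # entries 0 and 2 are tied
avg_ranks = rank_with_ties(toy_values, "average")
rnd_ranks = rank_with_ties(toy_values, "random", seed=42)
print("average:", avg_ranks)
print("random: ", rnd_ranks)
```

With 'average' the two tied entries both receive rank 1.5; with 'random' they receive ranks 1 and 2 in an order that depends on the seed, while the untied entries keep ranks 3 and 4 either way.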
Example
$ chipr -i input_prefix1.bed input_prefix2.bed input_prefix3.bed input_prefix4.bed -m 2 -o output_prefix
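If ChIP-R needs to be called from a Python pipeline, the same command can be assembled and run with the standard library. The sketch below uses only flags listed in the help text above; the helper names `build_chipr_command` and `run_chipr` are hypothetical and not part of ChIP-R.

```python
import shutil
import subprocess


def build_chipr_command(inputs, output_prefix, minentries=1,
                        rankmethod="pvalue", alpha=0.05):
    """Assemble a chipr argument list from flags shown in 'chipr -h'."""
    if not inputs:
        raise ValueError("at least one input file is required")
    return [
        "chipr",
        "-i", *inputs,
        "-m", str(minentries),
        "--rankmethod", rankmethod,
        "-a", str(alpha),
        "-o", output_prefix,
    ]


def run_chipr(cmd):
    """Run the command, failing early if chipr is not on PATH."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("chipr executable not found on PATH")
    return subprocess.run(cmd, check=True)


cmd = build_chipr_command(
    ["input_prefix1.bed", "input_prefix2.bed",
     "input_prefix3.bed", "input_prefix4.bed"],
    "output_prefix",
    minentries=2,
)
print(" ".join(cmd))
# To execute it (requires ChIP-R to be installed and the files to exist):
# run_chipr(cmd)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when file names contain spaces.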
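Output
Important result files:
- prefixname_all.bed: all intersected peaks, ordered from most significant to least significant (10 columns)
- prefixname_t2.bed: the tier 2 intersected peaks, i.e. the peaks that fall within the binomial threshold (10 columns)
- prefixname_T1.bed: the tier 1 intersected peaks, i.e. the peaks that fall within the user-defined threshold (10 columns)
- prefixname_log.txt: a log containing the number of peaks appearing in each tier.
The prefixname.bed files have 10 columns. The output follows the standard peak format for BED files, with the addition of a 10th column that specifies the ranks of the peaks that produced this possible peak. See the toy example below.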
chr | start | end | name | score | strand | signalValue | p-value | q-value |
---|---|---|---|---|---|---|---|---|
chr1 | 9118 | 10409 | T3_peak_87823 | 491 | . | 15.000000 | 0.113938 | 0.712353 |
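For downstream work, a row like the one above can be read into a small typed record. The sketch below assumes the .bed files are tab-delimited, as BED files conventionally are, and uses the nine column names from the table header. Any further columns, such as the 10th column described above, are kept as raw strings because their exact format is not shown here. The `Peak` type and `parse_peak_line` function are hypothetical.

```python
from typing import NamedTuple, Tuple


class Peak(NamedTuple):
    chrom: str
    start: int
    end: int
    name: str
    score: int
    strand: str
    signal_value: float
    p_value: float
    q_value: float
    extra: Tuple[str, ...]  # any columns after the ninth, left unparsed


def parse_peak_line(line):
    """Parse one tab-delimited peak line into a Peak record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 9:
        raise ValueError(f"expected at least 9 columns, got {len(fields)}")
    return Peak(
        fields[0], int(fields[1]), int(fields[2]), fields[3],
        int(fields[4]), fields[5],
        float(fields[6]), float(fields[7]), float(fields[8]),
        tuple(fields[9:]),
    )


# The toy row from the table, written as a tab-delimited line.
toy_line = "chr1\t9118\t10409\tT3_peak_87823\t491\t.\t15.000000\t0.113938\t0.712353"
peak = parse_peak_line(toy_line)
print(peak.name, "length:", peak.end - peak.start)
```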
Citation
Contact
Authors: Rhys Newell, Michael Piper, Mikael Boden, Alexandra Essebier
Contact: rhys.newell(at)uq.edu.au