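APEC Python package - PyPI

Single-cell epigenomic clustering based on accessibility pattern

Detailed description of the APEC Python project

APEC User Guide (v1.1.0)

(Accessibility Pattern based Epigenomic Clustering)

APEC performs fine cell type clustering on single-cell chromatin accessibility data from scATAC-seq, snATAC-seq, sciATAC-seq or any other related experiment. It can also be used to evaluate the expression of related genes, search for differential motifs/genes for each cell cluster, find super enhancers, and construct a pseudotime trajectory (by calling Monocle). Users who have already obtained the fragment-count-per-peak matrix from another mapping pipeline (such as CellRanger) should start from the first part, "Run APEC from fragment count matrix". Users who have only the raw fastq files should jump to the second part, "Get fragment count matrix from raw data".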

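Run APEC from fragment count matrix

1. Requirements and installation

1.1 Requirements

APEC requires a Linux system (CentOS 7.3+ or Ubuntu 16.04+) and Python (2.7.15+ or 3.6.8+). Users who want to build a pseudotime trajectory with APEC should also install R (3.5.1) and Monocle (2.10.0). In addition, APEC requires the following software: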

Bedtools: http://bedtools.readthedocs.io/en/latest/content/installation.html
Meme 4.11.2: http://meme-suite.org/doc/download.html?man_type=web
Homer: http://homer.ucsd.edu/homer/

Note: users need to download the genome references for Homer with "perl /path-to-homer/configureHomer.pl -install hg19" and "perl /path-to-homer/configureHomer.pl -install mm10".
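
Before installing APEC it can help to confirm that the external tools are reachable from the shell. The following is a minimal sketch, not part of APEC; the executable names in ASSUMED_EXECUTABLES are assumptions for illustration and should be adjusted to the commands that your Bedtools, Meme and Homer installations actually provide.

```python
import shutil

# Assumed executable names (not taken from the APEC guide); edit as needed.
ASSUMED_EXECUTABLES = ["bedtools", "meme", "homer2"]


def missing_tools(executables):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in executables if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(ASSUMED_EXECUTABLES)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

An empty result only shows that the commands can be located; it does not check their versions.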

APEC requires the files in the reference folder. We did not upload the reference files to GitHub because they are too large. Users can download all reference files from http://galaxy.ustc.edu.cn:30803/apec/. The reference folder should contain the following files:

hg19_RefSeq_genes.gtf, hg19_chr.fa, hg19_chr.fa.fai,
mm10_RefSeq_genes.gtf, mm10_chr.fa, mm10_chr.fa.fai,
JASPAR2018_CORE_vertebrates_non-redundant_pfms_meme.txt, tier1_markov1.norc.txt
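
A quick way to confirm that a downloaded reference folder is complete is to compare its contents against the list above. This is a small sketch using only the standard library; the function name missing_reference_files is hypothetical and not part of APEC.

```python
import os

# File names copied from the list of required reference files.
REFERENCE_FILES = [
    "hg19_RefSeq_genes.gtf", "hg19_chr.fa", "hg19_chr.fa.fai",
    "mm10_RefSeq_genes.gtf", "mm10_chr.fa", "mm10_chr.fa.fai",
    "JASPAR2018_CORE_vertebrates_non-redundant_pfms_meme.txt",
    "tier1_markov1.norc.txt",
]


def missing_reference_files(reference_dir):
    """Return the required file names that are absent from reference_dir."""
    return [
        name for name in REFERENCE_FILES
        if not os.path.isfile(os.path.join(reference_dir, name))
    ]
```

Calling missing_reference_files("reference") returns an empty list when every file is present.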

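1.2 Install and import APEC
Users can install APEC with:
pip install APEC==1.1.0.8
Because of compatibility problems (especially with rpy2), we do not recommend using a conda environment. Users can build a dedicated environment for APEC with pyenv. Users who want to call PAGA (instead of Monocle) to construct the pseudotime trajectory should use APEC in a Python 3 environment and install the following packages:
pip install scanpy anndata
In IPython, a Jupyter notebook or a Python script, users can import the APEC package with:
from APEC import clustering,plot,generate
Users can look up the manual of each APEC function with "help()" in IPython or Jupyter, for example:
help(clustering.cluster_byAccesson)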
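2. Input data

Users need to prepare a project folder (referred to as "$project") that contains the matrix, peak, result and figure folders. Place "filtered_cells.csv" and "filtered_reads.mtx" in the matrix folder, and "top_filtered_peaks.bed" in the peak folder. The three input files are described below: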

filtered_cells.csv: Two-column (separated by tabs) list of cell information ('name' and 'notes'):
                    The 'name' column stores cell names (or barcodes); the 'notes' column can be cell-type,
                    development stage, batch index or any other cell information, such as:
                    	name    notes
                    	CD4-001 CD4
                    	CD4-002 CD4
                    	CD8-001 CD8
                    	CD8-002 CD8
top_filtered_peaks.bed: Three-column list of peaks, which is a standard bed format file.
                        It is similar to the "peaks.bed" file in the CellRanger output of a 10X scATAC-seq dataset.
filtered_reads.mtx: Fragment count matrix in mtx format, where each row is a peak and each column represents a cell.
                    It is similar to the "matrix.mtx" file in the CellRanger output of a 10X scATAC-seq dataset.
                    The order of cells should be the same as in "filtered_cells.csv", and the order of peaks should
                    be the same as in "top_filtered_peaks.bed".
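
The layout described above can be sketched with a few lines of standard-library Python. This is an illustration under stated assumptions, not an APEC utility: the function write_example_project and its arguments are hypothetical, and the matrix is written in the Matrix Market coordinate format with 1-based indices, rows as peaks and columns as cells.

```python
import os


def write_example_project(project, cells, peaks, counts):
    """Write the three input files for a tiny example project.

    cells:  list of (name, notes) tuples, in column order of the matrix
    peaks:  list of (chrom, start, end) tuples, in row order of the matrix
    counts: dict mapping (peak_index, cell_index) -> fragment count (0-based)
    """
    for sub in ("matrix", "peak", "result", "figure"):
        os.makedirs(os.path.join(project, sub), exist_ok=True)

    # filtered_cells.csv: tab-separated, with a 'name' and a 'notes' column.
    with open(os.path.join(project, "matrix", "filtered_cells.csv"), "w") as f:
        f.write("name\tnotes\n")
        for name, notes in cells:
            f.write("%s\t%s\n" % (name, notes))

    # top_filtered_peaks.bed: three-column bed file, one peak per line.
    with open(os.path.join(project, "peak", "top_filtered_peaks.bed"), "w") as f:
        for chrom, start, end in peaks:
            f.write("%s\t%d\t%d\n" % (chrom, start, end))

    # filtered_reads.mtx: only non-zero entries are stored.
    nonzero = sorted((key, value) for key, value in counts.items() if value != 0)
    with open(os.path.join(project, "matrix", "filtered_reads.mtx"), "w") as f:
        f.write("%%MatrixMarket matrix coordinate integer general\n")
        f.write("%d %d %d\n" % (len(peaks), len(cells), len(nonzero)))
        for (peak_index, cell_index), value in nonzero:
            f.write("%d %d %d\n" % (peak_index + 1, cell_index + 1, value))
```

For example, write_example_project("project01", [("CD4-001", "CD4"), ("CD8-001", "CD8")], [("chr1", 100, 600)], {(0, 0): 2}) creates a project with two cells and one peak. The peak coordinates here are made up for illustration.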

3. Functions of APEC (step by step)
3.1 Clustering by APEC
Use the following code to cluster cells by the APEC algorithm:
clustering.build_accesson('$project', ngroup=600)
clustering.cluster_byAccesson('$project', norm='zscore')
Input parameters:
ngroup: Number of accessons, default=600.
nc: Number of cell clusters, set it to 0 if users want to predict cluster number by Louvain algorithm, default=0.
norm: Normalization method for accesson matrix, can be 'zscore' or 'probability', default='zscore'.
filter: Filter high dispersion accessons or not, can be 'yes' or 'no', default='yes'.
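
The guide does not spell out what the two norm options compute. As a rough illustration only, the sketch below shows what "z-score" and "probability" normalization commonly mean for a cells-by-accessons matrix. Both functions are hypothetical, and the choice of axis (z-score per accesson column, probability per cell row) is an assumption rather than APEC's documented behavior.

```python
import statistics


def zscore_columns(matrix):
    """Z-score each column (assumed: one accesson) across rows (assumed: cells)."""
    columns = list(zip(*matrix))
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in matrix
    ]


def probability_rows(matrix):
    """Scale each row (assumed: one cell) so that its entries sum to 1."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([v / total if total > 0 else 0.0 for v in row])
    return result
```

Columns with zero variance and rows with zero total are mapped to 0.0 here to avoid division by zero; a real pipeline might filter them instead.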
Output files:
[file list not preserved in this copy]
Users can then plot the tSNE, UMAP or correlation heatmap of the cells:
[code listing not preserved in this copy]
Input parameters:
[parameter list not preserved in this copy]
Output files:
Tsne_by_apec_with_notes_label.pdf
cell_cell_correlation_by_apec_with_notes_label.png
3.2 Clustering by chromVAR (optional, required for motif analysis)
Use the following code to cluster cells by the chromVAR algorithm:
[code listing not preserved in this copy]
Input parameters:
[parameter list not preserved in this copy]
Output files:
[file list not preserved in this copy]
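3.3 Evaluate ARI, NMI and AMI for the clustering result
If the "notes" column of "$project/matrix/filtered_cells.csv" holds the true cell types, use the following code to calculate ARI (adjusted Rand index), NMI (normalized mutual information) and AMI (adjusted mutual information) and so estimate the accuracy of the clustering algorithm.
[code listing not preserved in this copy]
The resulting ARI, NMI and AMI values are printed directly on the screen. Make sure that filtered_cells.csv contains the FACS label of each cell. For some datasets, such as hematopoietic cells, users should ignore all "unknown" cells before calculating ARI.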
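
To make the ARI and the handling of "unknown" cells concrete, here is a self-contained sketch of the standard adjusted Rand index formula. It is not APEC's implementation; the function names are hypothetical, and math.comb requires Python 3.8 or later.

```python
from collections import Counter
from math import comb


def drop_unknown(true_labels, predicted_labels, unknown="unknown"):
    """Remove cells whose true label equals the 'unknown' marker."""
    kept = [(t, p) for t, p in zip(true_labels, predicted_labels) if t != unknown]
    return [t for t, _ in kept], [p for _, p in kept]


def adjusted_rand_index(true_labels, predicted_labels):
    """Adjusted Rand index between two labelings of the same cells."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    total_pairs = comb(len(true_labels), 2)
    if total_pairs == 0:
        return 1.0

    # Count pairs of cells that fall together in the contingency table,
    # in the true classes, and in the predicted clusters.
    together = sum(comb(c, 2) for c in Counter(zip(true_labels, predicted_labels)).values())
    in_true = sum(comb(c, 2) for c in Counter(true_labels).values())
    in_pred = sum(comb(c, 2) for c in Counter(predicted_labels).values())

    expected = in_true * in_pred / total_pairs
    maximum = (in_true + in_pred) / 2.0
    if maximum == expected:
        return 1.0  # both labelings are trivial and identical in structure
    return (together - expected) / (maximum - expected)
```

A typical call first filters the labels, for example t, p = drop_unknown(notes, clusters), and then evaluates adjusted_rand_index(t, p). Cluster names do not need to match the cell-type names, since only the grouping matters.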
3.4 Generate pseudotime trajectory
By default, APEC uses Monocle to generate the pseudotime trajectory from the accesson matrix:
[code listing not preserved in this copy]
Input parameters:
[parameter list not preserved in this copy]
Output files:
[file list not preserved in this copy]
In a Python 3 environment, users can also use PAGA to generate the trajectory:
[code listing not preserved in this copy]
Output files:
[file list not preserved in this copy]
PDF of the pseudotime trajectory with notes labels
3.5 Generate gene expression
[code listing not preserved in this copy]
Input parameters:
[parameter list not preserved in this copy]
Output files:
[file list not preserved in this copy]
Users can also score genes by the peaks around their TSS regions:
[code listing not preserved in this copy]
Output files:
[file list not preserved in this copy]
3.6 Generate differential features for a cell cluster
Get differential accessons:
[code listing not preserved in this copy]
Get differential motifs/genes:
[code listing not preserved in this copy]
Input parameters:
[parameter list not preserved in this copy]
Output files:
[file list not preserved in this copy]
3.7 Plot motif/gene on tSNE/trajectory diagram
[code listing not preserved in this copy]
Input parameters:
[parameter list not preserved in this copy]
Output files:
Gene_Foxo1_on_Tsne_by_Apec.pdf
Motif_gata1_on_trajectory_by_apec.pdf
Note: plotting on the tSNE diagram requires running plot.plot_tsne() beforehand (see 3.1), and plotting on the trajectory requires running generate.monocle_trajectory() beforehand (see 3.4).
3.8 Generate potential super enhancers
[code listing not preserved in this copy]
Input parameters:
[parameter list not preserved in this copy]
Output files:
[file list not preserved in this copy]
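Get fragment count matrix from raw data
(This part is only available on GitHub: https://github.com/qukunlab/apec)
1. Requirements and installation
All of the following software needs to be placed in the global environment of the Linux system so that it can be called from any path or folder. Picard is also required, but we have placed it in the $apec/reference folder, so users do not need to install it. We recommend the latest version of each program, except for Meme (version 4.11.2).
[software list not preserved in this copy]
1.2 Installation
Users can install this part simply by copying the code_v1.1.0 folder and the reference folder into the same path. Users must run apec_prepare_steps.sh directly in code_v1.1.0/, because each program invokes the reference files automatically. The reference folder is required, but we did not upload the reference files to GitHub because they are too large. Users can download all reference files from http://galaxy.ustc.edu.cn:30803/apec/. The reference folder should contain the following files:
[file list not preserved in this copy]
2. Fragment count matrix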
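2.1 Arrangement of raw data
The raw data folder should contain all of the raw sequencing fastq files. All of these paired-end fastq files should be named as:
[naming pattern not preserved in this copy]
where "_1" and "_2" denote the forward and reverse reads of paired-end sequencing. {type1, type2, ...} can be cell types or sample batches, such as {GM, K562, ...} or {batch1, batch2, ...}, or any other word without an underscore "_" or a dash "-". Users need to build a project folder to store the results. The work, matrix, peak and figure folders will be generated automatically by the subsequent steps and placed in the project folder.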
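
Since every cell needs both a "_1" and a "_2" file, a quick pairing check can catch incomplete uploads before the long mapping step. The sketch below is an illustration, not part of APEC; the function name is hypothetical and the ".fastq" extension is an assumption, so pass a different extension if your files use one.

```python
def unpaired_fastq_prefixes(filenames, extension=".fastq"):
    """Return sorted name prefixes that have only one of the _1/_2 mates."""
    forward, reverse = set(), set()
    for name in filenames:
        if not name.endswith(extension):
            continue  # ignore files that are not fastq files
        stem = name[:-len(extension)]
        if stem.endswith("_1"):
            forward.add(stem[:-2])
        elif stem.endswith("_2"):
            reverse.add(stem[:-2])
    # Prefixes present in exactly one of the two sets are missing a mate.
    return sorted(forward ^ reverse)
```

In practice the input would be the listing of the raw data folder, for example unpaired_fastq_prefixes(os.listdir(raw_dir)) after importing os, where raw_dir is a placeholder for your own folder.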
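2.2 Easy-run of matrix preparation
Users can use the script apec_prepare_steps.sh to go from the raw data to the fragment count matrix. The script includes the steps "trimming", "mapping", "peak calling", "aligning read counts matrix" and "quality control". Running it on our example project (i.e. Project01 with 672 cells) takes 10 to 20 hours on an 8-core, 32 GB computer, with sequence mapping being the slowest step.
Example:
[code listing not preserved in this copy]
Input parameters:
[parameter list not preserved in this copy]
Output files:
The script apec_prepare_steps.sh generates the work, peak, matrix and figure folders with many output files. Here we only introduce the files that are useful to users. For our example project, all of these results can be reproduced on a general computer system.
(1) In the work folder:
For each cell, the mapping step generates a subfolder (named after the cell) in the work folder. Each subfolder contains several useful files:
[file list not preserved in this copy]
(2) In the peak folder:
[file list not preserved in this copy]
(3) In the matrix folder:
[file list not preserved in this copy]
(4) In the figure folder:
[file list not preserved in this copy]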
License: BSD License (BSD 3-Clause)
Maintainer: libinsnet