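Linkage disequilibrium estimation package
Detailed description of the ld-estimator Python project
ld_estimator: a package for estimating linkage disequilibrium
Calculates linkage disequilibrium in Python. It uses the maximum-likelihood method of Excoffier & Slatkin, translated from HaploView to C++ and exposed through Python bindings. It is not too slow: it can calculate 1000 pairs per second.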
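To give a feel for what a maximum-likelihood estimate of this kind involves, here is a minimal pure-Python sketch of an expectation-maximization (EM) loop for two biallelic sites. It is not the package's C++ code. The function name `em_pairwise_ld` is hypothetical, and the sketch assumes diploid samples only, alleles coded 0/1, and no missing genotypes. Only double heterozygotes have ambiguous phase, so the E-step splits them between the two possible haplotype pairs in proportion to the current frequency estimates.

```python
from collections import Counter

HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]

def em_pairwise_ld(var1, var2, max_iter=1000, tol=1e-10):
    """Illustrative EM estimate of (|D'|, r^2) from unphased diploid genotypes."""
    known = Counter()      # haplotypes whose phase is unambiguous
    n_double_het = 0       # samples heterozygous at both sites
    for a, b in zip(var1, var2):
        if sum(a) == 1 and sum(b) == 1:
            n_double_het += 1
        else:
            # at least one site is homozygous, so any pairing gives the same haplotypes
            known.update(zip(a, b))
    n_hap = 2 * len(var1)

    freq = {h: 0.25 for h in HAPLOTYPES}   # start from equal frequencies
    for _ in range(max_iter):
        # E-step: probability that a double heterozygote carries 00 + 11
        cis = freq[(0, 0)] * freq[(1, 1)]
        trans = freq[(0, 1)] * freq[(1, 0)]
        p_cis = cis / (cis + trans) if cis + trans > 0 else 0.5
        expected = {
            (0, 0): known[(0, 0)] + n_double_het * p_cis,
            (1, 1): known[(1, 1)] + n_double_het * p_cis,
            (0, 1): known[(0, 1)] + n_double_het * (1 - p_cis),
            (1, 0): known[(1, 0)] + n_double_het * (1 - p_cis),
        }
        # M-step: re-estimate haplotype frequencies from expected counts
        new_freq = {h: expected[h] / n_hap for h in HAPLOTYPES}
        delta = max(abs(new_freq[h] - freq[h]) for h in HAPLOTYPES)
        freq = new_freq
        if delta < tol:
            break

    p_a = freq[(1, 0)] + freq[(1, 1)]   # alt allele frequency at site 1
    p_b = freq[(0, 1)] + freq[(1, 1)]   # alt allele frequency at site 2
    d = freq[(1, 1)] - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float('nan'), float('nan')   # a monomorphic site has no defined LD
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return abs(d) / d_max, d * d / denom
```

Called as `em_pairwise_ld(var1, var2)` with lists of `(allele, allele)` tuples, the sketch returns the absolute value of D' together with r². Haploid samples, which the package handles through `is_haploid`, are deliberately left out to keep the example short.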
Installation
The simplest way to install ld_estimator is through pip:
pip install ld_estimator
Usage
Using ld_estimator within a Python environment:
from ld_estimator import pairwise_ld

var1 = [(0,0),(0,0),(0,1),(1,0),(1,1),(0,1),(0,0),(0,0),(1,1)]
var2 = [(0,0),(0,0),(0,1),(1,0),(1,1),(0,1),(0,0),(1,1),(1,1)]
# one flag per sample
is_haploid = [False,False,False,False,False,False,False,True,True]
ld = pairwise_ld(var1, var2, is_haploid)
print(ld.dprime)
print(ld.r_squared)

# or calculate LD for all pairs of variants in a region in a VCF
# (chrom, start and end are placeholders for your own region):
from ld_estimator import region_ld
vcf_path = 'PATH_TO_VCF'
ld = region_ld(vcf_path, chrom, start, end)

# or calculate LD to a site within a region in a VCF. This defaults to checking
# variants within a 100 kb window of the specified site.
from ld_estimator import site_ld
vcf_path = 'PATH_TO_VCF'
ld = site_ld(vcf_path, chrom, pos, window=200000)

# can pass in multiple positions in the same region at once
ld = site_ld(vcf_path, chrom, [pos1, pos2, pos3], window=200000)

# both region_ld() and site_ld() can take a list of sample IDs to subset the
# samples used for calculating LD. For example:
ld = site_ld(vcf_path, chrom, pos, subset=['sample1', 'sample2'])

# if the variant is on a sex chromosome, you'll have to pass in a list of sample
# sexes (matching order of the subset IDs if present, otherwise the VCF samples)
ld = site_ld(vcf_path, 'X', 20000000, sexes=['male', 'female'])