Simulating epistatic SNP models on Linux
Detailed description of the EpistaSim_Linux Python project
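EpistaSim is a Linux simulator that works under a haplotype selection model. Using forward and backward (coalescent) processes with mutation and recombination, it estimates haplotype frequencies and simulates DNA sequences in the region linked to two target loci. Its output is similar to that of Hudson's ms software (Hudson, 1990). EpistaSim is a flexible simulator that can incorporate different epistasis models. It runs forward and coalescent simulations and writes the history (trajectory) of haplotype frequencies, together with the DNA sequences of the region, to text files. EpistaSim consists of two parts: the forward simulation and the backward simulation.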
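To make the idea of a forward process under haplotype selection more concrete, here is a minimal sketch of one generation of a generic two-locus Wright-Fisher model. It is not taken from the EpistaSim source and may not correspond to any of its epistasis models. The function name `wright_fisher_step`, the `pop_size` parameter, and the choice to give only haplotype 11 a fitness advantage are assumptions made for illustration. Here `r` is the recombination probability between the two loci per generation (for example, a per-bp rate multiplied by the distance between the loci).

```python
import random


def wright_fisher_step(freqs, s=0.01, r=9e-6, pop_size=1000, rng=None):
    """Advance haplotype frequencies [x00, x01, x10, x11] by one generation.

    Illustrative assumptions (not EpistaSim's actual code):
    - only haplotype 11 is favoured, with fitness 1 + s;
    - r is the recombination probability between the two loci;
    - pop_size haplotypes are resampled each generation (genetic drift).
    """
    rng = rng or random.Random()

    # Selection: reweight each haplotype by its relative fitness.
    fitness = [1.0, 1.0, 1.0, 1.0 + s]
    mean_fitness = sum(x * w for x, w in zip(freqs, fitness))
    selected = [x * w / mean_fitness for x, w in zip(freqs, fitness)]

    # Recombination: linkage disequilibrium D decays at rate r.
    d = selected[0] * selected[3] - selected[1] * selected[2]
    expected = [
        selected[0] - r * d,
        selected[1] + r * d,
        selected[2] + r * d,
        selected[3] - r * d,
    ]

    # Drift: sample the next generation from the expected frequencies.
    draws = rng.choices(range(4), weights=expected, k=pop_size)
    return [draws.count(h) / pop_size for h in range(4)]


if __name__ == "__main__":
    freqs = [0.25, 0.25, 0.25, 0.25]
    rng = random.Random(42)
    for _ in range(200):
        freqs = wright_fisher_step(freqs, rng=rng)
    print(freqs)
```

Repeating the step for the number of generations given by -g yields a frequency trajectory of the kind EpistaSim writes to its trajectory file, although the real simulator also tracks mutations and full sequences.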
Download and install
Download the package "EpistaSim_Linux" and extract it with the following command:
tar -zxvf EpistaSim_Linux-1.1.0.tar.gz
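Running EpistaSim
Options:
The parameters are summarized below. If a parameter value is not specified, the default or a random value is used.
In haplotype notation, 0 is the ancestral allele and 1 is the derived allele.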
Switch | Argument | Comments (Default) |
---|---|---|
-n | Number of samples in the simulation | 30 |
-d | Number of replications of the simulation | 1 |
-l | Length of the simulated region (bp) | 1000 |
-g | Number of generations in the forward process | 200 |
-t | Positions of the two selected loci | Random |
-p | Haplotype frequencies in the order 00, 01, 10 and 11 | Must be specified |
-R | Recombination rate per generation per bp | 3×10^-8 |
-u | Mutation rate per generation per bp | 3×10^-8 |
-e | Number of segregating sites (segsites) in the simulated region | Random |
-M | Epistasis model in the forward process (M1, M2, M3, M4) | M1 |
-H | Selected haplotype or allele | 11 |
-S | Selection coefficient | |
-o | Output file for the DNA sequences in the simulated region | |
-f | Output file for the haplotype frequency trajectories | |
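The switches in the table can also be assembled programmatically. The following is a minimal sketch of a helper that builds the argument list for one of the EpistaSim scripts. The function name `build_command` and the check that the four haplotype frequencies sum to 1 are assumptions for illustration; only the switches themselves come from the table.

```python
def build_command(script, haplotype_freqs, **options):
    """Build an argument list such as ['python', 'ForWard.py', '-p', ...].

    haplotype_freqs: frequencies in the order 00, 01, 10, 11 (switch -p).
    options: other switches without the leading dash, e.g. n=10, t=[10, 100].
    The sum-to-one check is an assumed sanity check, not documented behaviour.
    """
    if len(haplotype_freqs) != 4:
        raise ValueError("-p needs four frequencies (00, 01, 10, 11)")
    if abs(sum(haplotype_freqs) - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies should sum to 1")

    cmd = ["python", script, "-p"] + [str(f) for f in haplotype_freqs]
    for switch, value in options.items():
        # Switches such as -t take several values; normalise to a list.
        values = value if isinstance(value, (list, tuple)) else [value]
        cmd.append("-" + switch)
        cmd.extend(str(v) for v in values)
    return cmd


if __name__ == "__main__":
    print(build_command("ForWard.py", [0.25, 0.25, 0.25, 0.25],
                        n=10, t=[10, 100], M="M2", S=0.01))
```

The resulting list can be passed to `subprocess.run(cmd, cwd=...)` with `cwd` set to the directory where the package was extracted.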
Forward example:
Use the following commands to simulate DNA sequences and haplotype frequencies with the forward process:

    cd ~/EpistaSim_Linux-1.1.0
    python ForWard.py -n 10 -d 5 -l 1000 -g 200 -t 10 100 -p 0.25 0.25 0.25 0.25 -R 0.0000001 -u 0.0000001 -e 10 -M M2 -H 11 -S 0.01 -o forwardsimulation.out -f Hapfre.trac

While running, ForWard.py prints progress information like the following:

    Generate the initial population
    Print the track file of haplotype frequency
    Simulation the offspring
    simulation the 0th replication
    A region of 1000bp include 12 segsites were simulated for 200 generations with sample size 10 for 1 replication.
    ..........
Backward example:
Use the following commands to simulate DNA sequences and haplotype frequencies with the coalescent (backward) process:

    cd ~/EpistaSim_Linux-1.1.0
    python BackWard.py -n 10 -d 5 -l 1000 -t 10 100 -p 0.3 0.1 0.1 0.5 -R 0.0000001 -u 0.0000001 -e 10 -H 11 -S 0.01 -o backwardsimulation.out -f Hapfreback.trac

While running, BackWard.py prints progress information like the following:

    Print the track file of haplotype frequency
    Simulation the offspring
    simulation the 0th replication
    A region of 1000bp include 9 segsites were simulated with sample size 10 for 1 replication.
    ..........
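- Note: all parameters except -p can be left at their default values.
Output of EpistaSim
The forward and backward simulations produce output in the same format, which is similar to that of Hudson's ms software (Hudson, 1990).
With the parameters used above, the results look as follows: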
DNA sequence output

    //
    Segsites: 12
    Selected two_locus: 10 100
    Positions: 10 100 125 158 258 309 472 631 756 818 858 886
    111011011111
    111011011110
    111010010110
    111111000111
    111001011011
    101100011111
    001110011111
    001000011111
    001011011111
    011100000001
    //
    ........
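The DNA sequence output can be read back for downstream analysis. Below is a minimal sketch of a parser for a single replicate, based only on the sample shown here. The function names `parse_replicate` and `derived_allele_frequencies` are hypothetical, and the parser works on whitespace-separated tokens, so it does not depend on the exact line layout of the file.

```python
SAMPLE = """//
Segsites: 12
Selected two_locus: 10 100
Positions: 10 100 125 158 258 309 472 631 756 818 858 886
111011011111
111011011110
111010010110
111111000111
111001011011
101100011111
001110011111
001000011111
001011011111
011100000001
"""


def parse_replicate(text):
    """Parse the first replicate of an EpistaSim-style sequence block."""
    tokens = text.split()
    segsites = int(tokens[tokens.index("Segsites:") + 1])
    j = tokens.index("two_locus:")
    selected = (int(tokens[j + 1]), int(tokens[j + 2]))
    k = tokens.index("Positions:") + 1
    positions = [int(t) for t in tokens[k:k + segsites]]

    # Haplotypes are 0/1 strings with one character per segregating site.
    haplotypes = []
    for tok in tokens[k + segsites:]:
        if len(tok) != segsites or set(tok) - {"0", "1"}:
            break  # reached the next "//" or any other non-haplotype token
        haplotypes.append(tok)
    return {"segsites": segsites, "selected": selected,
            "positions": positions, "haplotypes": haplotypes}


def derived_allele_frequencies(haplotypes):
    """Fraction of sampled haplotypes carrying allele 1 at each site."""
    n = len(haplotypes)
    return [sum(h[i] == "1" for h in haplotypes) / n
            for i in range(len(haplotypes[0]))]


if __name__ == "__main__":
    rep = parse_replicate(SAMPLE)
    print(rep["positions"])
    print(derived_allele_frequencies(rep["haplotypes"]))
```

Because 1 denotes the derived allele, the per-site values are derived allele frequencies in the sample.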
Haplotype frequency output

    //
    T 00 01 10 11
    0 0.25 0.25 0.25 0.25
    1 0.248079102592 0.253290720757 0.254300434135 0.244329742516
    2 0.250079789017 0.260533576401 0.254926425626 0.234460208956
    3 0.247683161282 0.257852724706 0.259331708331 0.235132405681
    ......
    199 0.174483716477 0.147128461696 0.166087155013 0.512300666814
    200 0.175953877557 0.145569161198 0.163958437969 0.514518523277
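The trajectory file can be loaded in a similar way. The sketch below parses the first block of such a file into (generation, frequencies) rows and computes the linkage disequilibrium D = x00·x11 − x01·x10 for a row. The function names `parse_trajectory` and `linkage_disequilibrium` are hypothetical, and the format is inferred from the sample above; parsing stops at the first token that is not part of a numeric row (for example the next "//").

```python
SAMPLE = """//
T 00 01 10 11
0 0.25 0.25 0.25 0.25
1 0.248079102592 0.253290720757 0.254300434135 0.244329742516
2 0.250079789017 0.260533576401 0.254926425626 0.234460208956
3 0.247683161282 0.257852724706 0.259331708331 0.235132405681
"""


def parse_trajectory(text):
    """Return a list of (generation, (x00, x01, x10, x11)) rows."""
    tokens = text.split()
    body = tokens[tokens.index("T") + 5:]  # skip the header "T 00 01 10 11"
    rows = []
    for i in range(0, len(body) - 4, 5):
        chunk = body[i:i + 5]
        try:
            generation = int(chunk[0])
            freqs = tuple(float(x) for x in chunk[1:])
        except ValueError:
            break  # not a numeric row: end of this block
        rows.append((generation, freqs))
    return rows


def linkage_disequilibrium(freqs):
    """D = x00 * x11 - x01 * x10 for frequencies ordered 00, 01, 10, 11."""
    x00, x01, x10, x11 = freqs
    return x00 * x11 - x01 * x10


if __name__ == "__main__":
    for generation, freqs in parse_trajectory(SAMPLE):
        print(generation, freqs[3], linkage_disequilibrium(freqs))
```

Tracking the frequency of the selected haplotype (11 in the examples) across generations shows how it rises under selection, from 0.25 to about 0.51 in the sample trajectory.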