A package for simplifying the MAPP prediction analysis workflow.
Mapp: detailed project description
This package simplifies running the MAPP program.
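Prerequisites
You must have installed:
- Python 2.7.
- The MAPP program.
- Optionally, a downloaded FASTA database (alternatively, the HMMER or BLAST programs can be used remotely).
- A sequence search program. HMMER is recommended; BLAST can be used instead.
- A multiple sequence alignment program (MAFFT is recommended).
- A phylogenetic tree construction program (FastTree is recommended).
- The Biopython package (only if you want to use the file conversion tools).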
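Settings file
A MAPP analysis requires a settings file written in Python ConfigParser syntax.
The settings file contains the commands to run and the input and output of each command. Each command must be a basic shell command (semicolons and similar constructs are not allowed).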
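When settings files are generated automatically, it can help to check each command against this restriction before writing it. The following is only a sketch: the source names semicolons explicitly, while the other rejected tokens in `FORBIDDEN_TOKENS` and the helper name `is_basic_command` are assumptions made for illustration, not rules taken from the package.

```python
# Hypothetical pre-check for commands placed in a settings file.
# Only ";" is named by the documentation; the remaining tokens are
# assumed here to count as "not a basic shell command".
FORBIDDEN_TOKENS = (";", "&&", "||", "`", "$(")


def is_basic_command(command):
    """Return True if the command is non-empty and has no forbidden token."""
    if not command.strip():
        return False
    return not any(token in command for token in FORBIDDEN_TOKENS)


if __name__ == "__main__":
    print(is_basic_command("java -jar MAPP.jar -f in.fasta -t in.tree -o out.txt"))
    print(is_basic_command("rm old.txt; java -jar MAPP.jar"))
```

A check like this is purely textual, so it cannot tell whether the package itself would accept a given command; it only catches the obvious cases early.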
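Because the settings file is plain text, it can easily be generated by another script, which makes it straightforward to run many MAPP analyses in one batch.
The minimal structure required for settings.conf is: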
    [commands]
    #id used for name of analysis
    id=
    #sequence file in fasta format
    sequence =
    #output file from blast or hmmer program
    blastout =
    #file where programs store stats about sequence picking
    blaststat =
    # blast command to run
    blast =
    # input file for multiple sequence aligning program (same as blast output)
    msain = %(blastout)s
    # msa program output file
    msaout =
    # this command is executed before msa program is executed
    # (it could be for blast output conversion and purifying)
    beforemsa =
    # msa program command
    msa =
    # tree program input file (converted msaout file to newick format)
    treein =
    # tree program output file
    treeout =
    # this command is executed before the tree program
    beforetree =
    # tree program command
    tree =
    # mapp program input msa file (in fasta format - could be the same as msaout value)
    mappinmsa =
    # mapp program input tree file (in newick format)
    mappintree =
    # mapp program output file
    mappout =
    # command before the mapp command (good for e.g. adjust tree to proper newick format)
    # MAPP is really sensitive to proper file format
    beforemapp =
    # MAPP command itself
    mapp = java -jar MAPP.jar -f %(mappinmsa)s -t %(mappintree)s -o %(mappout)s
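Since the settings file can be produced by another script, a generator might look roughly like the sketch below. It uses only the standard-library ConfigParser module and fills in a subset of the keys for brevity; a real settings file needs every key listed above. The function names `write_settings` and `read_option` and all file names are placeholders invented for this example, not part of the package. Reading a value back shows how `%(name)s` references are expanded from other keys in the same section.

```python
import os
import tempfile

try:
    import configparser  # Python 3
except ImportError:  # Python 2.7, which this package targets
    import ConfigParser as configparser


def write_settings(path, analysis_id, sequence_file):
    """Write a partial [commands] section for one analysis (placeholder values)."""
    parser = configparser.ConfigParser()
    parser.add_section("commands")
    entries = [
        ("id", analysis_id),
        ("sequence", sequence_file),
        ("blastout", analysis_id + ".blastout.fasta"),
        ("msain", "%(blastout)s"),
        ("mappinmsa", analysis_id + ".msa.fasta"),
        ("mappintree", analysis_id + ".tree"),
        ("mappout", analysis_id + ".mapp.out"),
        ("mapp", "java -jar MAPP.jar -f %(mappinmsa)s "
                 "-t %(mappintree)s -o %(mappout)s"),
    ]
    for key, value in entries:
        parser.set("commands", key, value)
    with open(path, "w") as handle:
        parser.write(handle)


def read_option(path, key):
    """Read one option back; %(name)s references are expanded on get()."""
    parser = configparser.ConfigParser()
    parser.read(path)
    return parser.get("commands", key)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    # One settings file per input sequence (hypothetical file names).
    for name in ("protein_a", "protein_b"):
        settings_path = os.path.join(workdir, name + ".conf")
        write_settings(settings_path, name, name + ".fasta")
        print(read_option(settings_path, "mapp"))
```

Each generated file can then be passed to the package in turn, one analysis per file.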
Basic usage in a Python program
Basic Python code:
    from mapp.core.analyzers import Analyzer
    from mapp.core.parsers import SettingsParser
    from mapp.core.exceptions import MappError

    settings_file = 'settings.conf'


    def main():
        try:
            mapp = Analyzer(SettingsParser(settings_file))
            mapp.exec_mapp()
        except MappError as e:
            print(e.header)
            print(e.description)


    if __name__ == '__main__':
        main()
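Helper files
The mapp.utils package contains several files that can be used either as standalone scripts or as importable modules. They are mainly intended for file conversion and cleanup.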