breakSeqInNs_then_translate
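1 Introduction
breakSeqInNs_then_translate is a tool that filters sequences by translating their protein-coding genes (PCGs) with the appropriate genetic code table. If any PCG in a sequence contains an internal stop codon, that sequence is filtered out. By Guanliang MENG; see https://github.com/linzhi2013/breakSeqInNs_then_translate.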
2 Installation
pip install breakSeqInNs_then_translate
The command breakSeqInNs_then_translate will be created in the same directory as your pip command.
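To confirm where the command ended up after installation, a small check along the following lines may help. This is only a sketch: the helper name command_dir is made up for illustration and is not part of the package. It relies solely on the standard-library shutil.which lookup.

```python
import os
import shutil


def command_dir(name):
    """Return the directory of an executable found on PATH, or None.

    Hypothetical helper for illustration; not part of the package.
    """
    path = shutil.which(name)
    return os.path.dirname(os.path.abspath(path)) if path else None


if __name__ == "__main__":
    # If installation worked as described, both lines should show the same directory.
    print("pip:", command_dir("pip"))
    print("breakSeqInNs_then_translate:", command_dir("breakSeqInNs_then_translate"))
```

If the second line prints None, the pip script directory is probably not on your PATH.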
3 Usage
    $ breakSeqInNs_then_translate
    usage: breakSeqInNs_then_translate.py [-h] [-seq <seq>] [-seqfile <file>]
                                          [-seqformat {fa,gb}] [-code <int>] [-nb]
                                          [-gb_genes <int>] [-maxStopGenes <int>]

    Filter the sequences by translating the protein coding genes (PCGs) with
    proper genetic code table, if one of the PCGs has interal stop codon, filter
    out this sequence. Beware: if the seq has Ns, then this script will translate
    the sub seqs with three frames (0, 1, 2), only when all these three kinds of
    frames have interal stopCodon the seq will be treated as have InternalStop!.
    By Guanliang MENG, see
    https://github.com/linzhi2013/breakSeqInNs_then_translate

    optional arguments:
      -h, --help           show this help message and exit
      -seq <seq>           input sequence
      -seqfile <file>      input fasta or genbank file
      -seqformat {fa,gb}   input -seqfile format [fa]
      -code <int>          genetic code table [1]
      -nb                  do not break sequence in Ns when translate [break]
      -gb_genes <int>      if a genbank record has no less than -gb_genes PCGs and
                           no more than -maxStopGenes PCGs has InternalStops, we
                           keep this record. But if a genbank record has less than
                           -gb_genes PCGs, then we will discard this record if any
                           of its PCGs has InternalStops. This is because we may
                           want to tolerate some assembly/sequencing errors. This
                           has no effect on fasta input file. [5]
      -maxStopGenes <int>  the maximum number of PCGs can have InternalStops in a
                           genbank record if do not discard it [0]
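
The rules described in the help text can be sketched in plain Python as below. This is not the tool's actual implementation, only one reading of the help message. All function names are hypothetical, the stop-codon set is hard-coded for genetic code table 1 (the tool itself accepts any table through -code), and the snippet assumes that a sequence is flagged when any N-free sub-sequence has an internal stop in all three frames, and that a stop in the last complete codon of a frame does not count as internal.

```python
import re

# Assumption: stop codons of the standard genetic code (table 1) only.
STOP_CODONS = frozenset({"TAA", "TAG", "TGA"})


def has_internal_stop(seq, frame=0, stops=STOP_CODONS):
    """True if a stop codon occurs before the last complete codon of the frame."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    return any(codon in stops for codon in codons[:-1])


def seq_has_internal_stop(seq, break_at_ns=True, stops=STOP_CODONS):
    """Apply the three-frame rule to sequences containing Ns.

    Without Ns (or with break_at_ns=False, mirroring -nb), only frame 0 is checked.
    """
    seq = seq.upper()
    if not break_at_ns or "N" not in seq:
        return has_internal_stop(seq, 0, stops)
    fragments = [f for f in re.split(r"N+", seq) if f]
    # A fragment counts only if frames 0, 1 and 2 all contain an internal stop.
    return any(
        all(has_internal_stop(f, frame, stops) for frame in (0, 1, 2))
        for f in fragments
    )


def keep_genbank_record(n_pcgs, n_pcgs_with_stops, gb_genes=5, max_stop_genes=0):
    """Decision rule described for -gb_genes and -maxStopGenes."""
    if n_pcgs >= gb_genes:
        return n_pcgs_with_stops <= max_stop_genes
    # Records with fewer PCGs are discarded if any PCG has an internal stop.
    return n_pcgs_with_stops == 0
```

For example, under these assumptions a GenBank record with 13 PCGs, one of which has an internal stop, would be kept with -maxStopGenes 1 but discarded with the default of 0, while a record with only 3 PCGs would be discarded as soon as any of them has an internal stop.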
4 Author
Guanliang MENG
5 Citation
Currently I have no plans to publish breakSeqInNs_then_translate.
However, since breakSeqInNs_then_translate makes use of Biopython, you should also cite it if you use breakSeqInNs_then_translate in your work:
Peter J. A. Cock, Tiago Antao, Jeffrey T. Chang, Brad A. Chapman, Cymon J. Cox, Andrew Dalke, Iddo Friedberg, Thomas Hamelryck, Frank Kauff, Bartek Wilczynski, Michiel J. L. de Hoon: “Biopython: freely available Python tools for computational molecular biology and bioinformatics”. Bioinformatics 25 (11), 1422–1423 (2009). https://doi.org/10.1093/bioinformatics/btp163
Please go to http://www.biopython.org/ for more details.