在python中,每隔n次从多个fasta文件获取序列

2024-06-12 05:32:32 发布

您现在位置:Python中文网/ 问答频道 /正文

我有一个包含4000个序列的多重fasta文件。我想每n次随机获得1个序列(用户定义)。所以,如果n=5,我将取第一个序列,然后是第六个,第十一个,直到它到达文件的末尾。每个删除的序列都会记录在另一个fasta文件中

我编写以下代码:

infile = sys.argv[2]                                #Name of the input file
seq = list(SeqIO.parse(infile,"fasta"))             #Create a list with all the sequence records
print "Input fasta file = ", infile

totseq = len(seq)                                   #Total number of sequences in the input file
print "Number of sequences in the original file = ", totseq

range = int(sys.argv[1])                          #Number of random sequences desired
print "Number of sequences picked = ", range

outfile = sys.argv[3]                               #Name of the output file
print "Output fasta file = ", outfile

outseq = []
outlist = []
print "Choosing output sequences:"

for i in infile:
  choose = [random.randint(1,totseq-1) for i in randseq]
  outrandseq.append(choose)
  print choose
  outseq = seq[choose]
  outlist.append(outseq)                            #Append seq record to output list

SeqIO.write(outlist, outfile, "fasta")              #Write the output list to the outfile

exit()

但是我找不到一种方法来进行互动

我想我的问题是:

      choose = [random.randint(1,totseq-1) for i in randseq]
    

错误在于:

  python fasta_extractor.py 5 genesTPS.fa genes_ext.fasta
Input fasta file =  genesTPS.fa
Number of sequences in the original file =  69
Number of random sequences desired =  5
Output fasta file =  genes_ext.fasta
Randomly chosen output sequences:
[52, 68, 35, 47, 68]
Traceback (most recent call last):
  File "fasta_extractor.py", line 37, in <module>
    outseq = seq[choose]
TypeError: list indices must be integers, not list

我不希望在我的范围内有5个序列,我希望每5个序列它选择一个并写入输出文件,直到范围结束。因此,如果我有100个序列,我的输出将由20个随机序列创建。

我将在这里放置一些序列:

>AY999875_1 Streptomyces hygroscopicus subsp_ glebosus strain AS 4_1873 16S ribosomal RNA gene partial sequence
-----------GCTTAACACATGCAAGTCGAACGATGAACC--TCCTTCGGGAGGGGATTAGTGGCGAACGGGTGAGTAACACGTGGGCAATCTGCCCTTCACTCTGGGACAAGCCCTGGAAACGGGGTCTAATACCGGATAC-GACTACCGACCGCATGGTCTGGTGGTGGAAAGCTCCG--GCGGTGAAGGATGA-GCCCGCGGCCTATCA-GCTTGTTGGTGGGGTGATGGCCTACCAAGGCGACGACGGGTAGCCGGCCTGAGAGGGCGACCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGAAAGCCTGATGCA-GCGAC--GCCGCGT-------GAGGGA--TGACGGCCTTCGGGTTGTAAACCTCTTTCAGCAGGGAAGAAGCGAGAGTG-ACGGTACCTGCAGAAGAAGCGCCGGCTAAC-TACGTGCCAGCAGCCGCGGTAATACGTAGGGCGCAAGCGTTGTCCGGAATTATTGGGCGTAAAGAGCTCGTAGGCGGCTTGTCACGTCGGATGTGAAAGCCCGGGGCTTAACCCCGGGTCTGCATTCGATACGGGCAGGCTAGAGTTCGGTAGGGGAGATCGGAATTCCTGGTGTAGCGGTGAAATGCGCAGATATCAGGAGGAACACCGGTGGCGAAGGCGGATCTCTGGGCCGATACTGACGCTGAGGAGCGAAAGCGTGGGGAGCGAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGTTGGGAACTAGGTGTGGGCGACATTCCACGTCGTCCGTGCCGCAGCTAACGCATTAAGTTCCCCGCCTGGGGAGTACGGCCGCAAGGCTAAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCAGCGGAGCATGTGGCTTAATTCGACGCAACGCGAAGAACCTTACCAAGGCTTGACATACACCGGAAACGTCTGGAGACAGGC--GCCCCCTTGTGGTCGGTGTACAGGTGGTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTGTTCTGTGTTGCCAGCATGCCCTTCGGGGTGATGGGGACTCACAGGAGACTGCCGGGGTCAACTCGGAGGAAGGTGGGGACGACGTCAAGTCATCATGCCCCTTATGTCTTGGGCTGCACACGTGCTACAATGGCCGGTACAATGAGCTGCGATACCGCGAGGTGGAGCGAATCTCAAAAAGCCGGTCTCAGTTCGGATTGGGGTCTGCAACTCGACCCCATGAAGTCGGAGTTGCTAGTAATCGCAGATCAGCATTGCTGCGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACGTCACGAAAGTCGGTAACACCCGAAGCCGGTGGCCCAACCCCTTGTGGGAGGGA-------------------------------------------------------------
>AJ781351_1 Streptomyces libani subsp_ rufus 16S rRNA gene type strain LMG 20087
----GCGGCGTGCTTAACACATGCAAGTCGAACGATGAACC--TCCTTCGGGAGGGGATTAGTGGCGAACGGGTGAGTAACACGTGGGCAATCTGCCCTTCACTCTGGGACAAGCCCTGGAAACGGGGTCTAATACCGGATAC-GACTACCGACCGCATGGTCTGGTGGTGGAAAGCTCCG--GCGGTGAAGGATGA-GCCCGCGGCCTATCA-GCTTGTTGGTGGGGTGATGGCCTACCAAGGCGACGACGGGTAGCCGGCCTGAGAGGGCGACCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGAAAGCCTGATGCA-GCGAC--GCCGCGT-------GAGGGA--TGACGGCCTTCGGGTTGTAAACCTCTTTCAGCAGGGAAGAAGCGAGAGTG-ACGGTACCTGCAGAAGAAGCGCCGGCTAAC-TACGTGCCAGCAGCCGCGGTAATACGTAGGGCGCAAGCGTTGTCCGGAATTATTGGGCGTAAAGAGCTCGTAGGCGGCTTGTCACGTCGGATGTGAAAGCCCGGGGCTTAACCCCGGGTCTGCATTCGATACGGGCAGGCTAGAGTTCGGTAGGGGAGATCGGAATTCCTGGTGTAGCGGTGAAATGCGCAGATATCAGGAGGAACACCGGTGGCGAAGGCGGATCTCTGGGCCGATACTGACGCTGAGGAGCGAAAGCGTGGGGAGCGAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGTTGGGAACTAGGTGTGGGCGACATTCCACGTCGTCCGTGCCGCAGCTAACGCATTAAGTTCCCCGCCTGGGGAGTACGGCCGCAAGGCTAAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCAGCGGAGCATGTGGCTTAATTCGACGCAACGCGAAGAACCTTACCAAGGCTTGACATACACCGGAAACGTCTGGAGACAGGC--GCCCCCTTGTGGTCGGTGTACAGGTGGTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTGTTCTGTGTTGCCAGCATGCCCTTCGGGGTGATGGGGACTCACAGGAGACTGCCGGGGTCAACTCGGAGGAAGGTGGGGACGACGTCAAGTCATCATGCCCCTTATGTCTTGGGCTGCACACGTGCTACAATGGCCGGTACAATGAGCTGCGATACCGCGAGGTGGAGCGAATCTCAAAAAGCCGGTCTCAGTTCGGATTGGGGTCTGCAACTCGACCCCATGAAGTCGGAGTTGCTAGTAATCGCAGATCAGCATTGCTGCGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACGTCACGAAAGTCGGTAACACCCGAAGCCGGTGGCCCAACCCCTTGTGGGAGGGAATCGTCGAAGGTGGGACTGGCGATTGGGACGAAGTCGTAACAAGGTAGCCGTACCGGAAGG
>AB045882_1 Streptomyces platensis gene for 16S rRNA
GCTGGCGGCGTGCTTAACACATGCAAGTCGAACGATGAACC--TCCTTCGGGAGGGGATTAGTGGCGAACGGGTGAGTAACACGTGGGCAATCTGCCCTTCACTCTGGGACAAGCCCTGGAAACGGGGTCTAATACCGGATACTGACTACCGACCGCATGGTCTGGTGGTGGAAAGCTCCG--GCGGTGAAGGATGA-GCCCGCGGCCTATCA-GCTTGTTGGTGGGGTGATGGCCTACCAAGGCGACGACGGGTAGCCGGCCTGAGAGGGCGACCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGAAAGCCTGATGCA-GCGAC--GCCGCGT-------GAGGGA--TGACGGCCTTCGGGTTGTAAACCTCTTTCAGCAGGGAAGAAGCGAGAGTG-ACGGTACCTGCAGAAGAAGCGCCGGCTAAC-TACGTGCCAGCAGCCGCGGTAATACGTAGGGCGCAAGCGTTGTCCGGAATTACTGGGCGTAAAGAGCTCGTAGGCGGCTTGTCACGTCGGATGTGAAAGCCCGGGGCTTAACCCCGGGTCTGCATTCGATACGGGCAGGCTAGAGTTCGGTAGGGGAGATCGGAATTCCTGGTGTAGCGGTGAAATGCGCAGATATCAGGAGGAACACCGGTGGCGAAGGCGGATCTCTGGGCCGATACTGACGCTGAGGAGCGAAAGCGTGGGGAGCGAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGTTGGGAACTAGGTGTGGGCGACATTCCACGTCGTCCGTGCCGCAGCTAACGCATTAAGTTCCCCGCCTGGGGAGTACGGCCGCAAGGCTAAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCAGCGGAGCATGTGGCTTAATTCGACGCAACGCGAAGAACCTTACCAAGGCTTGACATACACCGGAAACGTCTGGAGACAGGC--GCCCCCTTGTGGTCGGTGTACAGGTGGTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTGTTCTGTGTTGCCAGCATGCCCTTCGGGGTGATGGGGACTCACAGGAGACTGCCGGGGTCAACTCGGAGGAAGGTGGGGACGACGTCAAGTCATCATGCCCCTTATGTCTTGGGCTGCACACGTGCTACAATGGCCGGTACAATGAGCTGCGATACCGCGAGGTGGAGCGAATCTCAAAAAGCCGGTCTCAGTTCGGATTGGGGTCTGCAACTCGACCCCATGAAGTCGGAGTTGCTAGTAATCGCAGATCAGCATTGCTGCGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACGTCACGAAAGTCGGTAACACCCGAAGCCGGTGGCCCAACCCCTTGTGGGAGGGAATCGTCGAAGGTGGGACTGGCGATTGGGACGAAGTCGTAACAAGGTAGCCGTACCGGAAGG
>DQ026662_1 Streptomyces ramulosus strain NRRL B-2714 16S ribosomal RNA gene partial sequence
GCTGGCGGCGTGCTTAACACATGCAAGTCGAACGATGAAGC--CGCTTCGGTGGTGGATTAGTGGCGAACGGGTGAGTAACACGTGGGCAATCTGCCCTTCACTCTGGGACAAGCCCTGGAAACGGGGTCTAATACCGGATAC-GACCACCGACCGCATGGTCTGGTGGTGGAAAGCTCCG--GCGGTGAAGGATGA-GCCCGCGGCCTATCA-GCTTGTTGGTGGGGTAATGGCCTACCAAGGCGACGACGGGTAGCCGGCCTGAGAGGGCGACCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGAAAGCCTGATGCA-GCGAC--GCCGCGT-------GAGGGA--TGACGGCCTTCGGGTTGTAAACCTCTTTCAGCAGGGAAGAAGCGAGAGTG-ACGGTACCTGCAGAAGAAGCGCCGGCTAAC-TACGTGCCAGCAGCCGCGGTAATACGTAGGGCGCAAGCGTTGTCCGGAATTATTGGGCGTAAAGAGCTCGTAGGCGGCTTGTCACGTCGGATGTGAAAGCCCGGGGCTTAACCCCGGGTCTGCATTCGATACGGGCAGGCTAGAGTTCGGTAGGGGAGATCGGAATTCCTGGTGTAGCGGTGAAATGCGCAGATATCAGGAGGAACACCGGTGGCGAAGGCGGATCTCTGGGCCGATACTGACGCTGAGGAGCGAAAGCGTGGGGAGCGAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGTTGGGAACTAGGTGTGGGCGACATTCCACGTCGTCCGTGCCGCAGCTAACGCATTAAGTTCCCCGCCTGGGGAGTACGGCCGCAAGGCTAAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCGGCGGAGCATGTGGCTTAATTCGACGCAACGCGAAGAACCTTACCAAGGCTTGACATACACCGGAAACGTCTGGAGACAGGC--GCCCCCTTGTGGTCGGTGTACAGGTGGTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTGTTCTGTGTTGCCAGCATGCCTTTCGGGGTGATGGGGACTCACAGGAGACTGCCGGGGTCAACTCGGAGGAAGGTGGGGACGACGTCAAGTCATCATGCCCCTTATGTCTTGGGCTGCACACGTGCTACAATGGCCGGTACAATGAGCTGCGATACCGCGAGGTGGAGCGAATCTCAAAAAGCCGGTCTCAGTTCGGATTGGGGTCTGCAACTCGACCCCATGAAGTCGGAGTCGCTAGTAATCGCAGATCAGCATTGCTGCGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACGTCACGAAAGTCGGTAACACCCGAAGCCGGTGGCCCAACCCCTTGTGGGAGGGAATCGTCGAAGGTGGGACTGGCGATTGGGACGAAGTCGTAACAAGGTAGCCGTACCGGAAGG
>AY999778_1 Streptomyces catenulae strain ISP 5258 16S ribosomal RNA gene partial sequence
GCTGGCGGCGTGCTTAACACATGCAAGTCGAACGATGAACC--TCCTTCGGGAGGGGATTAGTGGCGAACGGGTGAGTAACACGTGGGCAATCTGCCCTTCACTCTGGGACAAGCCCTGGAAACGGGGTCTAATACCGGATAC-GACCACCGACCGCATGGTCTGGTGGTGGAAAGCTCCG--GCGGTGAAGGATGA-GCCCGCGGCCTATCA-GCTTGTTGGTGGGGTAATGGCCTACCAAGGCGACGACGGGTAGCCGGCCTGAGAGGGCGACCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGAAAGCCTGATGCA-GCGAC--GCCGCGT-------GAGGGA--TGACGGCCTTCGGGTTGTAAACCTCTTTCAGCAGGGAAGAAGCGAGAGTG-ACGGTACCTGCAGAAGAAGCGCCGGCTAAC-TACGTGCCAGCAGCCGCGGTAATACGTAGGGCGCAAGCGTTGTCCGGAATTATTGGGCGTAAAGAGCTCGTAGGCGGCTTGTCACGTCGGATGTGAAAGCCCGGGGCTTAACCCCGGGTCTGCATTCGATACGGGCAGGCTAGAGTTCGGTAGGGGAGATCGGAATTCCTGGTGTAGCGGTGAAATGCGCAGATATCAGGAGGAACACCGGTGGCGAAGGCGGATCTCTGGGCCGATACTGACGCTGAGGAGCGAAAGCGTGGGGAGCGAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGTTGGGAACTAGGTGTGGGCGACATTCCACGTCGTCCGTGCCGCAGCTAACGCATTAAGTTCCCCGCCTGGGGAGTACGGCCGCAAGGCTAAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCGGCGGAGCATGTGGCTTAATTCGACGCAACGCGAAGAACCTTACCAAGGCTTGACATACACCGGAAAACACTGGAGACAGTG--TCCCCCTTGTGGTCGGTGTACAGGTGGTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTGTTCTGTGTTGCCAGCATGCCCTTCGGGGTGATGGGGACTCACAGGAGACTGCCGGGGTCAACTCGGAGGAAGGTGGGGACGACGTCAAGTCATCATGCCCCTTATGTCTTGGGCTGCACACGTGCTACAATGGCCGGTACAATGAGCTGCGATACCGCGAGGTGGAGCGAATCTCAAAAAGCCGGTCTCAGTTCGGATTGGGGTCTGCAACTCGACCCCATGAAGTCGGAGTCGCTAGTAATCGCAGATCAGCATTGCTGCGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACGTCACGAAAGTCGGTAACACCCGAAGCCGGTGGCCCAACCCCTTGTGGGAGGGAATCGTCGAAGGTGGGACTGGCGATTGGGACGAAGTCGTAACAAGGTAGCCGTACCGGAAGG
>DQ442509_1 Streptomyces angustmyceticus strain NRRL B-2347 16S ribosomal RNA gene partial sequence
GCTGGCGGCGTGCTTAACACATGCAAGTCGAACGATGAAGC---CCTTCGGG-GTGGATTAGTGGCGAACGGGTGAGTAACACGTGGGCAATCTGCCCTGCACTCTGGGACAAGCCCTGG-AACGGGGTCTAATACCGGATAT-GACTACTGACCGCATGGT-TGGTGGTGGAAAGCTCCG--GCGGTGCAGGATGAGGCCCGCGGCCTATCAGGCTTGTTGGTGGGGTGATGGCCTACCAAGGCGACGACGGGTAGCCGGCCTGAGAGGGCGACCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGAAAGCCTGATGCA-GCGAC--GCCGCGT-------GAGGGA--TGACGGCCTTCGGGTTGTAAACCTCTTTCAGCAGGGAAGAAGCGAAAGTG-ACGGTACCTGCAGAAGAAGCGCCGGCTAAC-TACGTGCCAGCAGCCGCGGTAATACGTAGGGCGCAAGCGTTGTCCGGAATTATTGGGCGTAAAGAGCTCGTAGGCGGCTTGTCGCGTCGGATGTGAAAGCCCGGGGCTTAACCCCGGGTCTGCATTCGATACGGGCAGGCTAGAGTTCGGTAGGGGAGATCGGAATTCCTGGTGTAGCGGTGAAATGCGCAGATATCAGGAGGAACACCGGTGGCGAAGGCGGATCTCTGGGCCGATACTGACGCTGAGGAGCGAAAGCGTGGGGAGCGAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGTTGGGAACTAGGTGTGGGCGACATTCCACGTCGTCCGTGCCGCAGCTAACGCATTAAGTTCCCCGCCTGGGGAGTACGGCCGCAAGGCTAAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCAGCGGAGCATGTGGCTTAATTCGACGCAACGCGAAGAACCTTACCAAGGCTTGACATACACCGGAAACGGCCAGAGATGGTC--GCCCCCTTGTGGTCGGTGTACAGGTGGTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTGTCCTGTGTTGCCAGCATGCCCTTCGGGGTGATGGGGACTCACAGGAGACTGCCGGGGTCAACTCGGAGGAAGGTGGGGACGACGTCAAGTCATCATGCCCCTTATGTCTTGGGCTGCACACGTGCTACAATGGCCGGTACAATGAGCTGCGATGCCGTGAGGTGGAGCGAATCTCAAAAAGCCGGTCTCAGTTCGGATTGGGGTCTGCAACTCGACCCCATGAAGTTGGAGTTGCTAGTAATCGCAGATCAGCATTGCTGCGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACGTCACGAAAGTCGGTAACACCCGAAGCCGGTGGCCCAA-CCCTTGT-GGAGGGAGCCGTCGAAGGTGGGACTGGCGATTGGGACGAAGTCGTAACAAGGTAGCCGTACCGGAAGG
>DQ442518_1 Streptomyces libani subsp_ libani strain NRRL B-3446T 16S ribosomal RNA gene partial sequence
GCTGGCGGCGTGCTTAACACATGCAAGTCGAACGATGATCCGGTGCTTGCATCGGGGATTAGTGGCGAACGGGTGAGTAACACGTGAGTAACCTGCCCTTAACTCTGGGATAAGC-CTGGAAACTGGGTCTAATACCGGATAT-GACTCCTCATCGCATGGT-GGGGGGTGGAAAGCTTTATTGTGGTTTTGGATGG-ACTCGCGGCCTATCA-GCTTGTTGGTGAGGTAATGGCTCACCAAGGCGACGACGGGTAGCCGGCCTGAGAGGGTGACCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTTCACGAGGGGCGCAAGCCTGATGCACGCGACCTTCCGCGTGACCGCGGAGGGA---GACGGCCTTCGGGTTGTAAACCTCTTTC-GTAGGGAAGAAGCGAAAGTGAACGGTACCTGCAGAAGAAGCGCCCTTTAAAGTACGTGCCAGCAGCCGCGGTAATACGTAGGGCGCAAGCGTTATCCGGAATTATTGGGCGTAAAGAGCTCGTAGGCGGTTTGTCGCGTCTGCCGTGAAAGTCCGGGGCTCAACTCCGGATCTGCGGTGGGTACGGGCAGACTAGAGTGATGTAGGGGAGACTGGAATTCCTGGTGTAGCGGTGAAATGCGCAGATATCAGGAGGAACACCGATGGCGAAGGCAGGTCTCTGGGCATTAACTGACGCTGAGGAGCGAAAGCATGGGGAGCGAACAGGATTAGATACCCTGGTAGTCCATGCCGTAAACGTTGGGCACTAGGTGTGGGGGACATTCCACGTTTTCCGCGCCGTAGCTAACGCATTAAGTGCCCCGCCTGGGGAGTACGGCCGCAAGGCTAAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCGGCGGAGCATGCGGATTAATTCGATGCAACGCGAAGAACCTTACCAAGGCTTGACATGGACCGGACCGGGCTGGAAACAGTCCTTCCCCTTTGGGGCCGGTTCACAGGTGGTGCATGGTTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTCGTTCCATGTTGCCAGCG-------CGTAATGGCGGGGACTCATGGGAGACTGCCGGGGTCAACTCGGAGGAAGGTGGGGACGACGTCAAATCATCATGCCCCTTATGTCTTGGGCTTCACGCATGCTACAATGGCCGGTACAAAGGGTTGCGATACTGTGAGGTGGAGCTAATCCCAAAAAGCCGGTCTCAGTTCGGATTGGGGTCTGCAACTCGACCCCATGAAGTCGGAGTCGCTAGTAATCGCAGATCAGCAACGCTGCGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCAAGTCACGAAAGTTGGTAACACCCGAAGCCGGTGGCCTAACCCCTTGTGGGAGGGAGCTGTCAAAGGTGGGACTGGCGATTGGGACTAAGTCGTAACAAGGTAGCCGTACCGGAAGG
>DQ442530_1 Streptomyces nigrescens strain NRRL B-12176T 16S ribosomal RNA gene partial sequence
GCTGGCGGCGTGCTTAACACATGCAAGTCGAACGATGAACC--TCCTTCGGGAGGGGATTAGTGGCGAACGGGTGAGTAACACGTGGGCAATCTGCCCTTCACTCTGGGACAAGCCCTGGAAACGGGGTCTAATACCGGATAC-GACTACCGACCGCATGGTCTGGTGGTGGAAAGCTCCG--GCGGTGAAGGATGA-GCCCGCGGCCTATCA-GCTTGTTGGTGGGGTGATGGCCTACCAAGGCGACGACGGGTAGCCGGCCTGAGAGGGCGACCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGAAAGCCTGATGCA-GCGAC--GCCGCGT-------GAGGGATGTGACGGCCTTCGGGTTGTAAACCTCTTTCAGCAGGGAAGAAGCGAGAGTG-ACGGTACCTGCAGAAGAAGCGCCGCCTAAC-TACGTGCCAGCAGCCGCGGTAATACGTAGGGCGCAAGCGTTGTCCGGAATTATTGGGCGTAAAGAGCTCGTAGGCGGCTTGTCACGTCGGATGTGAAAGCCCGGGGCTTAACCCCGGGTCTGCATTCGATACGGGCAGGCTAGAGTTCGGTAGGGGAGATCGGAATTCCTGGTGTAGCGGTGAAATGCGCAGATATCAGGAGGAACACCGGTGGCGAAGGCGGATCTCTGGGCCGATACTGACGCTGAGGAGCGAAAGCGTGGGGAGCGAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGTTGGGAACTAGGTGTGGGCGACATTCCACGTCGTCCGTGCCGCAGCTAACGCATTAAGTTCCCCGCCTGGGGAGTACGGCCGCAAGGCTAAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCAGCGGAGCATGTGGCTTAATTCGACGCAACGCGAAGAACCTTACCAAGGCTTGACATACACCGGAAAACCCTGGAGACAGGG--TCCCCCTTGTGGTCGGTGTACAGGTGGTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTGTTCTGTGTTGCCAGCATGCCCTTCGGGGTGATGGGGACTCACAGGAGACTGCCGGGGTCAACTCGGAGGAAGGTGGGGACGACGTCAAGTCATCATGCCCCTTATGTCTTGGGCTGCACACGTGCTACAATGGCCGGTACAATGAGCTGCGATACCGCGAGGTGGAGCGAATCTCAAAAAGCCGGTCTCAGTTCGGATTGGGGTCTGCAACTCGACCCCATGAAGTCGGAGTTGCTAGTAATCGCAGATCAGCATTGCTGCGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACGTCACGAAAGTCGGTAACACCCGAAGCCGGTGGCCCAACCCCTTGTGGGAGGGAATCGTCGAAGGTGGGACTGGCGATTGGGACGAAGTCGTAACAAGGTAGCCGTA--------
>AJ621612_2 Streptomyces tubercidicus 16S rRNA gene type strain DSM 40261T
GCTGGCGGCGTGCTTAACACATGCAAGTCGAACGATGAACC--TCCTTCGGGAGGGGATTAGTGGCGAACGGGTGAGTAACACGTGGGCAATCTGCCCTTCACTCTGGGACAAGCCCTGGAAACGGGGTCTAATACCGGATAC-GACTACCGACCGCATGGTCTGGTGGTGGAAAGCTCCG--GCGGTGAAGGATGA-GCCCGCGGCCTATCA-GCTTGTTGGTGGGGTGATGGCCTACCAAGGCGACGACGGGTAGCCGGCCTGAGAGGGCGACCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGAAAGCCTGATGCA-GCGAC--GCCGCGT-------GAGGGA--TGACGGCCTTCGGGTTGTAAACCTCTTTCAGCAGGGAAGAAGCGAGAGTG-ACGGTACCTGCAGAAGAAGCGCCGGCTAAC-TACGTGCCAGCAGCCGCGGTAATACGTAGGGCGCAAGCGTTGTCCGGAATTATTGGGCGTAAAGAGCTCGTAGGCGGCTTGTCACGTCGGATGTGAAAGCCCGGGGCTTAACCCCGGGTCTGCATTCGATACGGGCAGGCTAGAGTTCGGTAGGGGAGATCGGAATTCCTGGTGTAGCGGTGAAATGCGCAGATATCAGGAGGAACACCGGTGGCGAAGGCGGATCTCTGGGCCGATACTGACGCTGAGGAGCGAAAGCGTGGGGAGCGAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGTTGGGAACTAGGTGTGGGCGACATTCCACGTCGTCCGTGCCGCAGCTAACGCATTAAGTTCCCCGCCTGGGGAGTACGGCCGCAAGGCTAAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCAGCGGAGCATGTGGCTTAATTCGACGCAACGCGAAGAACCTTACCAAGGCTTGACATACACCGGAAAACCCTGGAGACAGGG--TCCCCCTTGTGGTCGGTGTACAGGTGGTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTGTTCTGTGTTGCCAGCATGCCCTTCGGGGTGATGGGGACTCACAGGAGACTGCCGGGGTCAACTCGGAGGAAGGTGGGGACGACGTCAAGTCATCATGCCCCTTATGTCTTGGGCTGCACACGTGCTACAATGGCCGGTACAATGAGCTGCGATACCGCGAGGTGGAGCGAATATCAAAAAGCCGGTCTCAGTTCGGATTGGGGTCTGCAACTCGACCCCATGAAGTCGGAGTTGCTAGTAATCGCAGATCAGCATTGCTGCGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACGTCACGAAAGTCGGTAACACCCGAAGCCGGTGGCCCAACCCCTTGTGGGAGGGAATCGTCGAAGGTGGGACTGGCGATTGGGACGAAGTCGTAACAAGGTACGGCTACCGGAAGG
>AJ391816_1 Streptomyces auratus partial 16S rRNA gene type strain NRRL 8097T
GCTGGCGGCGTGCTTAACACATGCAAGTCGAACGATGAAGC---CCTTCGGG-GTGGATTAGTGGCGAACGGGTGAGTAACACGTGGGCAATCTGCCCTTCACTCTGGGACAAGCCCTGGAAACGGGGTCTAATACCGGATAT-GACACACGACCGCATGGTTTGTGTGTGGAAAGCTCCG--GCGGTGAAGGATGA-GCCCGGGGCCTATCA-GCTTGTTGGTGGGGTAATGGCCTACCAAGGCGACGACGGGTAGCCGGCCTGAGAGGGCGACCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCCAAAGCCTGATGCA-GCGAC--GCCGCGT-------GAGGGA--TGACGGCCTTCGGGTTGTAAACCTCTTTCAGCAGGGAAGAAGCGAAAGTG-ACGGTACCTGCAGAAGAAGCGCCGGCTAAC-TACGTGCCAGCAGCCGCGGTAATACGTAGGGCGCAAGCGTTGTCCGGAATTATTGGGCGTAAAGAGCTCGTAGGCGGCTTGTCACGTCGGATGTGAAAGCCCGGGGCTTAACCCCGGGTCTGCATTCGATACGGGCAGGCTAGAGTTCGGTAGGGGAGATCGGAATTCCTGGTGTAGCGGTGAAATGCGCAGATATCAGGAGGAACACCGGTGGCGAAGGCGGATCTCTGGGCCGAT-CTGACGCTGATGAGCGAAAGCGTGGGGAGCTAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGTTGGGAACTAGGTGTGGGCGACATTCCACGTCGTCCGTGCCGCAGCTAACGCATTAAGTTCCCCGCCTGGGGAGTACGGCCGCAAGGCTAAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCAGCGGAGCATGTGGCTTAATTCGACGCAACGCGAAGAACCTTACCAAGGCTTGACATACACCGGAAAACCCTGGAGACAGGG--TCCCCCTTGTGGTCGGTGTACAGGTGGTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTGTTCTGTGTTGCCAGCATGCCCTTCGGGGTGATGGGGACTCACAGGAGACTGCCGGGGTCAACTCGGAGGAAGGTGGGGACGACGTCAAGTCATCATGCCCCTTATGTCTTGGGCTGCACACGTGCTACAATGGCCGGTACAATGAGCTGCGATACCGCGAGGTGGAGCGAATCTCAAAAAGCCGGTCTCAGTTCTGATTGGGGTCTGCAACTCGACCCCATGAAGTCGGAGTTGCTAGTAATCGCAGATCACCATTGCTGCGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACGTCACGAAAGTCGGTAACACCCGAAGCCGGTGGCCCAACCCCTTGTGGGAGGGAATCGTCGAAGGTGGGACTGGCGATTGGGACGAAGTCGTAACAAGGTAGCCGTACCGGAAGG
>DQ026654_1 Streptomyces sioyaensis strain NRRL B-5408 16S ribosomal RNA gene partial sequence
GCTGGCGGCGTGCTTAACACATGCAAGTCGAACGATGAACC--TCCTTCGGGAGGGGATTAGTGGCGAACGGGTGAGTAACACGTGGGCAATCTGCCCTTCACTCTGGGACAAGCCCTGGAAACGGGGTCTAATACCGGATAC-GACACACGACCGCATGGTCTGTGTGTGGAAAGCTCCG--GCGGTGAAGGATGA-GCCCGCGGCCTATCA-GCTTGTTGGTGGGGTGATGGCCTACCAAGGCGACGACGGGTAGCCGGCCTGAGAGGGCGACCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGAAAGCCTGATGCA-GCGAC--GCCGCGT-------GAGGGA--TGACGGCCTTCGGGTTGTAAACCTCTTTCAGCAGGGAAGAAGCGAGAGTG-ACGGTACCTGCAGAAGAAGCGCCGGCTAAC-TACGTGCCAGCAGCCGCGGTAATACGTAGGGCGCAAGCGTTGTCCGGAATTATTGGGCGTAAAGAGCTCGTAGGCGGCTTGTCGCGTCGGATGTGAAAGCCCGGGGCTTAACCCCGGGTCTGCATTCGATACGGGCAGGCTAGAGTTCGGTAGGGGAGATCGGAATTCCTGGTGTAGCGGTGAAATGCGCAGATATCAGGAGGAACACCGGTGGCGAAGGCGGATCTCTGGGCCGATACTGACGCTGAGGAGCGAAAGCGTGGGGAGCGAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGTTGGGAACTAGGTGTGGGCGACATTCCACGTCGTCCGTGCCGCAGCTAACGCATTAAGTTCCCCGCCTGGGGAGTACGGCCGCAAGGCTAAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCAGCGGAGCATGTGGCTTAATTCGACGCAACGCGAAGAACCTTACCAAGGCTTGACATACACCGGAAAACCCTGGAGACAGGG--TCCCCCTTGTGGTCGGTGTACAGGTGGTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTGTTCTGTGTTGCCAGCATGCCCTTCGGGGTGATGGGGACTCACAGGAGACTGCCGGGGTCAACTCGGAGGAAGGTGGGGACGACGTCAAGTCATCATGCCCCTTATGTCTTGGGCTGCACACGTGCTACAATGGCCGGTACAATGAGCTGCGATACCGCGAGGTGGAGCGAATCTCAAAAAGCCGGTCTCAGTTCGGATTGGGGTCTGCAACTCGACCCCATGAAGTCGGAGTTGCTAGTAATCGCAGATCAGCATTGCTGCGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACGTCACGAAAGTCGGTAACACCCGAAGCCGGTGGCCCAACCCCTTGTGGGAGGGAATCGTCGAAGGTGGGACTGGCGATTGGGACGAAGTCGTAACAAGGTAGCCGTACCGGAAGG
>Streptomyces O
---------------------TGC-AGTCGAACGATGAACC--TCCTTCGGGAGGGGATTAGTGGCGAACGGGTGAGTAACACGTGGGCAATCTGCCCTTCACTCTGGGACAAGCCCTGGAAACGGGGTCTAATACCGGATAC-GACCTCCGACCGCATGGTCTGGTGGTGGAAAGCTCCG--GCGGTGAAGGATGA-GCCCGCGGCCTATCA-GCTTGTTGGTGGGGTGATGGCCTACCAAGGCGACGACGGGTAGCCGGCCTGAGAGGGCGACCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGAAAGCCTGATGCA-GCGAC--GCCGCGT-------GAGGGA--TGACGGCCTTCGGGTTGTAAACCTCTTTCAGCAGGGAAGAAGCGAGAGTG-ACGGTACCTGCAGAAGAAGCGCCGGCTAAC-TACGTGCCAGCAGCCGCGGTAATACGTAGGGCGCAAGCGTTGTCCGGAATTATTGGGCGTAAAGAGCTCGTAGGCGGCTTGTCACGTCGGATGTGAAAGCCCGGGGCTTAACCCCGGGTCTGCATTCGATACGGGCAGGCTAGAGTTCGGTAGGGGAGATCGGAATTCCTGGTGTAGCGGTGAAATGCGCAGATATCAGGAGGAACACCGGTGGCGAAGGCGGATCTCTGGGCCGATACTGACGCTGAGGAGCGAAAGCGTGGGGAGCGAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGTTGGGAACTAGGTGTGGGCGACATTCCACGTCGTCCGTGCCGCAGCTAACGCATTAAGTTCCCCGCCTGGGGAGTACGGCCGCAAGGCTAAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCAGCGGAGCATGTGGCTTAATTCGACGCAACGCGAAGAACCTTACCAAGGCTTGACATACACCGGAAAACCCTGGAGACAGGG--TCCCCCTTGTGGTCGGTGTACAGGTGGTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTGTTCTGTGTTGCCAGCATGCCCTTCGGGGTGATGGGGACTCACAGGAGACTGCCGGGGTCAACTCGGAGGAAGGTGGGGACGACGTCAAGTCATCATGCCCCTTATGTCTTGGGCTGCACACGTGCTACAATGGCCGGTACAATGAGCTGCGATACCGCGAGGTGGAGCGAATCTCAAAAAGCCGGTCTCAGTTCGGATTGGGGTCTGCAACTCGACCCCATGAAGTCGGAGTTGCTAGTAATCGCAGATCAGCATTGCTGCGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACGTCACGAAAGTCGGTAACACCCGAAGCCGGTGGCCCAACCCCTTGTGGGAGGGAATCGTCGAA----------------------------------------------------

Tags: ofthe序列fastafilesequencegenestrain
1条回答
网友
1楼 · 发布于 2024-06-12 05:32:32

我发现了错误并重写了代码:

for i in range(0,totseq,randseq):
    #choose = i + random.randint(1,randseq-1) 
    choose = i+ random.randint(1,randseq) 
    for j in range(len(outrandseq)):                  #Test to see if the random sequence record number has already been chosen
      if choose == outrandseq[j]:
        choose = random.randint(1,totseq-1)           #Choose a new random sequence record number if the current one has already been chosen
    outrandseq.append(choose)
    print choose
    outseq = seq[choose]
    outlist.append(outseq)              

相关问题 更多 >