Compiling separate files

Posted 2024-05-16 23:45:09


Suppose there are several files like these:

File1
    >TAIR:175_a
     ALSKDJFLKAHGLKASJDFLAKJSDLKGHALKSDHGALKALKSJDF
    >TAIR:175_b
     ZZZLAALSKDJFALKSDJFL;KJEIURALKDJFNVALKSDJFKZZZ
    >TAIR:175_c
     ALSKDJFLKAHGLKASJDFLAKJSDLKGHALKSDHGALKALKSJDF

File2
    >TAIR:674_a
     ASLALKSDGHLA;KSJDFIEURALKSDHGLANVALKSDJGHKLJA
    >TAIR:674_b
     ASLALKSDGHDJGDGSDDFIEURALKSDHGLANVALKSDJGHKLJA

File3
    >TAIR:812_a
     KLJALSKDHGLAKSDHJFIEUROWASDLKGNIEASDFJKWERLJKJ
    >TAIR:812_c
     ASLALKSDGHLA;KSJDFIEURALKSDHGLANVALKSDJGHKLJA

File4
    >TAIR:975_b
     KLJALSKDHGLAKSDHJFIEUROWASDLKGNIEASDFJKWERLJKJ

File5
    >TAIR:444_b
     QQALKSDJFWOIAOQIWUERTOIUQTOIUOQIWEURLASKDJFA
    >TAIR:444_c
     QQALKSDJFWOIAOQIWUERTOIUQTOIUOQIWEURLASKDJFA

I wrote this code to extract the names of all the sequences in the directory:

#!/usr/bin/env python
from Bio import SeqIO
filenames = ["file1","file2","file3"]
ids = []

for record in filenames:
    f = SeqIO.parse(record, 'fasta')
    ids.append(f.id)

print ids

The result is:

 python search_list.py 
[<generator object parse at 0x7f32836018c0>, <generator object parse at 0x7f3283601910>, <generator object parse at 0x7f3283601960>]

The result I expect is:

file_a
    >TAIR:175_a
     ALSKDJFLKAHGLKASJDFLAKJSDLKGHALKSDHGALKALKSJDF
    >TAIR:674_a
     ASLALKSDGHLA;KSJDFIEURALKSDHGLANVALKSDJGHKLJA

file_b
    >TAIR:175_b
     ZZZLAALSKDJFALKSDJFL;KJEIURALKDJFNVALKSDJFKZZZ
    >TAIR:674_b
     ASLALKSDGHDJGDGSDDFIEURALKSDHGLANVALKSDJGHKLJA
    >TAIR:975_b
     KLJALSKDHGLAKSDHJFIEUROWASDLKGNIEASDFJKWERLJKJ
    >TAIR:444_b
     QQALKSDJFWOIAOQIWUERTOIUQTOIUOQIWEURLASKDJFA

file_c
    >TAIR:175_c
     ALSKDJFLKAHGLKASJDFLAKJSDLKGHALKSDHGALKALKSJDF
    >TAIR:812_c
     ASLALKSDGHLA;KSJDFIEURALKSDHGLANVALKSDJGHKLJA
    >TAIR:444_c
     QQALKSDJFWOIAOQIWUERTOIUQTOIUOQIWEURLASKDJFA

Any suggestions for solving this? Should I open the files named in the "ids" list and compile them?


2 Answers

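You get this output because you are asking Python to print a list of generator objects, so it prints each object's default representation (its type and memory address) rather than its contents. You may be better off using Python's standard open() and looping over the list of files you want to check. You can then iterate over each line in a file and add it to a list, or do whatever else you like with it. Let me know if an example would be useful.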
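A minimal sketch of the plain-open() approach described above, in case it helps. The function name `read_fasta_ids` is made up for illustration, and it assumes each record starts with a `>` header line whose first whitespace-separated token is the ID:

```python
def read_fasta_ids(path):
    """Return the IDs found on the ">" header lines of a FASTA file."""
    ids = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # Drop the leading ">" and keep the first token as the ID.
                fields = line[1:].split()
                if fields:
                    ids.append(fields[0])
    return ids


def read_all_ids(filenames):
    """Collect the IDs from several FASTA files, in the order given."""
    ids = []
    for filename in filenames:
        ids.extend(read_fasta_ids(filename))
    return ids
```

Calling `read_all_ids(["file1", "file2", "file3"])` would then return one flat list of IDs instead of a list of generator objects.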

(Ignoring the missing parentheses on print,) your code produces the following on my system (Python 3.6.0; Biopython 1.69):

AttributeError: 'generator' object has no attribute 'id'

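because SeqIO.parse() returns a generator, which has to be iterated to get at the individual records. Your "expected result" is also not what this code can produce. Given this code, what you should expect is: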

['TAIR:175_a', 'TAIR:674_a', 'TAIR:812_a', 'TAIR:975_b', 'TAIR:175_b', 'TAIR:444_b', 'TAIR:175_c', 'TAIR:444_c']

In my environment, the following code gives you that:

from Bio import SeqIO

filenames = ["file1.fasta", "file2.fasta", "file3.fasta"]

ids = []

for filename in filenames:
    records = SeqIO.parse(filename, 'fasta')

    for record in records:
        ids.append(record.id)

print(ids)
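The code above only collects the IDs. If the goal is really the grouped files shown in the question (file_a, file_b, file_c), something along these lines might work. This is only a sketch under assumptions: the group is taken to be the text after the last underscore in each ID, the function names are made up for illustration, and a small plain-Python parser stands in for SeqIO.parse() so the example runs without Biopython (the `id` and `seq` of each SeqIO record could be used instead).

```python
from collections import defaultdict


def parse_fasta(path):
    """Yield (record_id, sequence) pairs from a FASTA file."""
    record_id, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if record_id is not None:
                    yield record_id, "".join(chunks)
                header = line[1:].strip()
                record_id = header.split()[0] if header else ""
                chunks = []
            elif record_id is not None:
                chunks.append(line)
    if record_id is not None:
        yield record_id, "".join(chunks)


def group_by_suffix(filenames):
    """Map each ID suffix (text after the last "_") to its records."""
    groups = defaultdict(list)
    for filename in filenames:
        for record_id, sequence in parse_fasta(filename):
            suffix = record_id.rsplit("_", 1)[-1]
            groups[suffix].append((record_id, sequence))
    return groups


def write_groups(groups, prefix="file_"):
    """Write one FASTA file per suffix, e.g. file_a, file_b, file_c."""
    for suffix, records in groups.items():
        with open(prefix + suffix, "w") as out:
            for record_id, sequence in records:
                out.write(">{}\n{}\n".format(record_id, sequence))
```

With this, `write_groups(group_by_suffix(filenames))` would write every record whose ID ends in `_a` to file_a, and so on, keeping the order in which the input files are listed.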
