SeqIO: "No records found in handle"
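I'm new to Python and Biopython and don't have much programming experience, so I'd appreciate any help.
I'm trying to extract CDS and/or rRNA sequences from GenBank. I only want the open reading frames, so I don't want to pull the entire sequence. When I run the code below, I get this error:
No records found in handle
The error is raised on this line: record = SeqIO.read(handle, "genbank")
I'm not sure how to fix it. The code I'm using is below.
Also, if there is a simpler way to do this, or code that has already been published, please let me know.
Thanks!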
# search sequences by a combination of keywords
# need to find (number of) results to set 'retmax' value
handle = Entrez.esearch(db = searchdb, term = searchterm)
records = Entrez.read(handle)
handle.close()
# repeat search with appropriate 'retmax' value
all_handle = Entrez.esearch(db = searchdb, term = searchterm, retmax = records['Count'])
records = Entrez.read(all_handle)
print " "
print "Number of sequences found:", records['Count'] #printing to make sure that code is working thus far.
print " "
locations = [] # store locations of target sequences
sequences = [] # store target sequences
for i in range(0,int(records['Count'])) :
    handle = Entrez.efetch(db = searchdb, id = records['IdList'][i], rettype = "gb", retmode = "xml")
    record = SeqIO.read(handle, "genbank")
    for feature in record.features:
        if feature.type==searchfeaturetype: #searches features for proper feature type
            if searchgeneproduct in feature.qualifiers['product'][0]: #searches features for proper gene product
                if str(feature.qualifiers) not in locations: # no repeat location entries
                    locations.append(str(feature.location)) # appends location entry
                    sequences.append(feature.extract(record.seq)) # append sequence
1 Answer
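You are requesting data from GenBank in XML format, but SeqIO.read with the "genbank" format expects the GenBank flat-file (plain text) format. Because the parser finds no flat-file record in the XML, it reports that the handle contains no records. Try changing your efetch line to this: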
handle = Entrez.efetch(db = searchdb, id = records['IdList'][i], rettype = "gb", retmode = "text")
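For reference, here is a rough sketch of how the whole loop might look with that change applied, written for Python 3. The function names extract_matching_features and fetch_records, and the email parameter, are made up for illustration and are not part of Biopython. The sketch also assumes you want to de-duplicate on the feature location: the original check compares str(feature.qualifiers) against a list of locations, so it can never match. It uses .get() for the 'product' qualifier because not every feature has one.

```python
def extract_matching_features(record, feature_type, product_keyword):
    """Return (location, sequence) pairs for features of the given type whose
    'product' qualifier contains product_keyword, skipping repeated locations."""
    seen = set()
    matches = []
    for feature in record.features:
        if feature.type != feature_type:
            continue
        # Some features have no 'product' qualifier; fall back to an empty string.
        products = feature.qualifiers.get("product", [""])
        if product_keyword not in products[0]:
            continue
        location = str(feature.location)
        if location in seen:  # compare locations with locations
            continue
        seen.add(location)
        matches.append((location, feature.extract(record.seq)))
    return matches


def fetch_records(db, term, email):
    """Search Entrez and yield one SeqRecord per hit (hypothetical helper,
    requires Biopython and network access)."""
    # Imported here so the filtering helper above can be used without Biopython.
    from Bio import Entrez, SeqIO

    Entrez.email = email  # NCBI asks for a contact address

    # First search only to learn how many hits there are.
    handle = Entrez.esearch(db=db, term=term)
    count = Entrez.read(handle)["Count"]
    handle.close()

    # Repeat the search with retmax set so that all IDs are returned.
    handle = Entrez.esearch(db=db, term=term, retmax=count)
    id_list = Entrez.read(handle)["IdList"]
    handle.close()

    for seq_id in id_list:
        # retmode="text" returns the flat file that the "genbank" parser expects.
        handle = Entrez.efetch(db=db, id=seq_id, rettype="gb", retmode="text")
        try:
            record = SeqIO.read(handle, "genbank")
        finally:
            handle.close()
        yield record
```

You would then loop over fetch_records(searchdb, searchterm, "you@example.org") and call extract_matching_features(record, searchfeaturetype, searchgeneproduct) on each record. Keeping the filtering separate from the download also lets you test it on a GenBank file you have already saved locally.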