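The assignment says:
Modify the program above so that, given the pattern GGCCTTGCCATTGG, it reports for each of the first 10 lines of the previous file:
· the edit distance of the substring in that line that is most similar to the pattern;
· the substring of that line that gives the minimum edit distance.
The program in question is: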
import time

def levenshtein_distance(first, second):
    # Keep the shorter string in `first` so the matrix has fewer rows.
    if len(first) > len(second):
        first, second = second, first
    if len(second) == 0:
        return len(first)
    first_length = len(first) + 1
    second_length = len(second) + 1
    distance_matrix = [[0] * second_length for x in range(first_length)]
    # Turning a prefix into the empty string costs one edit per character.
    for i in range(first_length):
        distance_matrix[i][0] = i
    for j in range(second_length):
        distance_matrix[0][j] = j
    for i in range(1, first_length):
        for j in range(1, second_length):
            deletion = distance_matrix[i-1][j] + 1
            insertion = distance_matrix[i][j-1] + 1
            # A substitution is free when the two characters already match.
            substitution = distance_matrix[i-1][j-1]
            if first[i-1] != second[j-1]:
                substitution += 1
            distance_matrix[i][j] = min(insertion, deletion, substitution)
    return distance_matrix[first_length-1][second_length-1]

def dna(patro):
    t1 = time.perf_counter()  # time.clock() was removed in Python 3.8
    f = open("HUMAN-DNA.txt")
    text = f.readlines()
    f.close()
    distanciaMin = 100000000
    distanciaPosicion = 0
    distanciaLinea = 0
    distanciaSubstring = ""
    numeroLinea = 0
    for line in text:
        numeroLinea = numeroLinea + 1
        # Each line still ends with "\n", so this range covers every full window.
        for i in range(len(line)-len(patro)):
            cadena = line[i:i+len(patro)]
            distancia = levenshtein_distance(cadena, patro)
            if distancia < distanciaMin:
                distanciaMin = distancia
                distanciaPosicion = i  # start index of the best window so far
                distanciaLinea = numeroLinea
                distanciaSubstring = cadena
    t2 = time.perf_counter()
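Now I run it with the new pattern and I get an edit distance, but I am not sure that the distance, or the substring of that line (the second point of the assignment), is correct. My question is: how can I count the first ten lines in the text?
Part of the file is: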
CCCATCTCTTTCTCATTCCTTGGTTGAGAACACGAACTTCAGGACTTGCCTCACACTAGGGCCCATTCTT
TGTTTCCCAGAAAGAAGAGGCTCTCCACACAGAGTCCCATGTACACCAGGCTGTCAACAAACATGAATTG
AATGAAGGAGTGGATGGTTGGGTGGAAGTGATTTAAGAAATCCTAACTGGGGAATTTCACTGGAAACTTA
GGAAATTCAATTTATATAAAGTCTATGAATCGTCCATTTTTGTGTCCGCACATTCAAATGCTGTAGCTAA
TTTCCTGCTAAACAGTAGAAATTCAGTAAGTGTTCATGTTGAAAGGATGAAATTTGAGTGCTCTTGCATC
CTCAAAGAACTCTAGTAAAATAGAAATAAAGCTTTATTTGGAAGATTAAGTCATGAGCATAATTATGAGA
AGGCGGTCATTCTAATAATAGTGTCTTCACAAGTAGATGCTACATGCTGTGTAATATTTTGACTAAAAAA
AGTTCCTCTCAACATTTCTGAAGTGAGATAATGTACAACGATCCATGTTTTTAGCTACCTTGATAAGTTT
AGTGCATCCAGGGCTCCTTTCTTACCTGCTAACCGCCGAGTTTCAAATGCTAAGAAATTCTTCATTTCCT
AACACAAATATTCAATATAATTGCTGGTTGTTTGGGAGAAGAAAAATTTAGAATTCAGAAAGAAATACAG
AATGAAATGTTCTAATCAATCGAAAAAGGATTCTATAGACTTCGACGTTGTCTGGTTTACAAAGCAGTCT
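I don't fully understand your whole question, but I'll try to address this part:
How can I count the first ten lines in the text?
You can use filehandler.readlines(). It loads the file into memory as a list in which each element is one line, still ending with its newline character. You can then take the first 10 lines from that list, for example with a slice such as text[:10] in your code.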
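To make the idea concrete, here is a minimal sketch of how the per-line report from the assignment could look. It is not the asker's program: the helper names best_match, first_lines_report and report_file are hypothetical, it uses itertools.islice instead of slicing the readlines() list so that only the first lines are read, and it assumes unit costs for insertion, deletion and substitution. As in the question, only windows with the same length as the pattern are compared.

```python
from itertools import islice


def levenshtein_distance(first, second):
    """Edit distance with unit costs, keeping only two rows of the matrix."""
    previous = list(range(len(second) + 1))
    for i, a in enumerate(first, start=1):
        current = [i]
        for j, b in enumerate(second, start=1):
            current.append(min(
                previous[j] + 1,             # deletion
                current[j - 1] + 1,          # insertion
                previous[j - 1] + (a != b),  # substitution, free on a match
            ))
        previous = current
    return previous[-1]


def best_match(line, pattern):
    """Return (distance, position, substring) for the closest window, or None."""
    best = None
    for i in range(len(line) - len(pattern) + 1):
        window = line[i:i + len(pattern)]
        distance = levenshtein_distance(window, pattern)
        if best is None or distance < best[0]:  # ties keep the leftmost window
            best = (distance, i, window)
    return best


def first_lines_report(lines, pattern, limit=10):
    """Yield (line_number, distance, position, substring) for the first lines."""
    for number, raw in enumerate(islice(lines, limit), start=1):
        match = best_match(raw.strip(), pattern)
        if match is not None:  # lines shorter than the pattern have no window
            yield (number,) + match


def report_file(path, pattern, limit=10):
    """Print one result row per line for the first `limit` lines of a file."""
    with open(path) as f:
        for row in first_lines_report(f, pattern, limit):
            print(*row)
```

Calling report_file("HUMAN-DNA.txt", "GGCCTTGCCATTGG") would then print one row per line instead of a single global minimum. If you are unsure about the distance values, a quick check is a pair with a well-known result: levenshtein_distance("kitten", "sitting") should be 3.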