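I have generated an edited DNA sequencing file with the individual reads on separate lines, and I want to remove any read that matches another line to within one character.
Input file: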
AAAAAAAAAAAA #Start checking at line 1
TTTTTTTTTTTT #Diff by >1 char: Keep
AAAAACAAAAAA #Diff by 1 char: Delete
AAAAACAAACAA #Diff by 2 char: Keep
AAAAAAAAAAAA #Diff by <1 char: Delete
Output file:
AAAAAAAAAAAA
TTTTTTTTTTTT
AAAAACAAACAA
What I have so far:
with open(current_file, 'r') as f:
    lineCharsList = []
    outLines = []
    for line in f:
        lineChars = line[:]
        if not (lineChars in lineCharsList): #exactly matches lines, need partial matching
            lineCharsList.append(lineChars)
            outLines.append(line)
            print(line)
You already have a good answer.
Here is my implementation in basic Python:
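A minimal sketch of what such a plain-Python version could look like, not necessarily the answerer's original code. It assumes all reads have the same length, so that "differs by N characters" means the number of mismatched positions, and that each read is compared only against reads already kept. The names `hamming` and `filter_reads` are hypothetical.

```python
def hamming(a, b):
    """Count the positions at which two equal-length reads differ."""
    if len(a) != len(b):
        raise ValueError("reads must have the same length")
    return sum(x != y for x, y in zip(a, b))


def filter_reads(reads, max_diff=1):
    """Keep a read only if it differs from every kept read by more than max_diff."""
    kept = []
    for read in reads:
        # Compare against kept reads only, so a deleted read never
        # causes a later read to be deleted.
        if all(hamming(read, k) > max_diff for k in kept):
            kept.append(read)
    return kept


# The reads from the question's input file, without the comments.
reads = [
    "AAAAAAAAAAAA",
    "TTTTTTTTTTTT",
    "AAAAACAAAAAA",
    "AAAAACAAACAA",
    "AAAAAAAAAAAA",
]

for read in filter_reads(reads):
    print(read)
```

On the question's input this prints the three reads listed under "Output file". Every read is checked against every kept read, so the running time grows quadratically with the number of distinct reads.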
pip install python-levenshtein
and use its distance function.
The code is:
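A minimal sketch of the python-levenshtein approach, assuming the intended call is `Levenshtein.distance` (the function is not named above, so this is a guess). The `dedupe_reads` helper and the pure-Python fallback are hypothetical additions so the example runs even without the package. Note that edit distance also counts insertions and deletions, which is slightly broader than counting mismatched positions.

```python
try:
    # Provided by the python-levenshtein package.
    from Levenshtein import distance
except ImportError:
    # Fallback (an assumption of this sketch): a plain dynamic-programming
    # edit distance, used only when the package is not installed.
    def distance(a, b):
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, start=1):
            curr = [i]
            for j, cb in enumerate(b, start=1):
                curr.append(min(
                    prev[j] + 1,               # deletion
                    curr[j - 1] + 1,           # insertion
                    prev[j - 1] + (ca != cb),  # substitution or match
                ))
            prev = curr
        return prev[-1]


def dedupe_reads(lines, max_dist=1):
    """Keep reads whose edit distance to every kept read exceeds max_dist."""
    kept = []
    for line in lines:
        read = line.strip()
        if not read:
            continue  # skip blank lines
        if all(distance(read, k) > max_dist for k in kept):
            kept.append(read)
    return kept


sample = [
    "AAAAAAAAAAAA\n",
    "TTTTTTTTTTTT\n",
    "AAAAACAAAAAA\n",
    "AAAAACAAACAA\n",
    "AAAAAAAAAAAA\n",
]

for read in dedupe_reads(sample):
    print(read)
```

Because `dedupe_reads` accepts any iterable of lines, an open file can be passed directly, for example `with open(current_file) as f: kept = dedupe_reads(f)`.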
Output: