I want to iterate over the list kmers and select only the items that consist solely of the characters A, T, G and C.
kmers = ["AL", "AT", "GC", "AA", "AP"]
DNA_kmers = []
for kmer in kmers:
    for letter in kmer:
        if letter not in ["A", "T", "G", "C"]:
            pass
        else:
            DNA_kmers.append(kmer)
print("DNA_kmers", DNA_kmers)
Output:
DNA_kmers ['AL', 'AT', 'AT', 'GC', 'GC', 'AA', 'AA', 'AP']
Expected output:
DNA_kmers=["AT","GC","AA"]
The only way I know is:
if "B" in kmer or "D" in kmer or "E" in kmer or "F" in kmer or "H" in kmer or "I" in kmer or "J" in kmer or "K" in kmer or "L" in kmer or "M" in kmer or "N" in kmer or "O" in kmer or "P" in kmer or "Q" in kmer or "R" in kmer or "S" in kmer or "U" in kmer or "V" in kmer or "W" in kmer or "X" in kmer or "Y" in kmer or "Z" in kmer:
    pass
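Your code currently appends an item once for every character that matches, so an item is added even when only one of its characters is valid. We can adjust it so that an item is appended only when both characters match: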
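A minimal sketch of that adjustment, assuming a for/else loop with break (the variable names follow the question):

```python
kmers = ["AL", "AT", "GC", "AA", "AP"]
DNA_kmers = []

for kmer in kmers:
    for letter in kmer:
        if letter not in ["A", "T", "G", "C"]:
            # A non-DNA letter was found, so stop checking this k-mer.
            break
    else:
        # Reached only when the inner loop finished without hitting break,
        # i.e. every letter of the k-mer is one of A, T, G, C.
        DNA_kmers.append(kmer)

print("DNA_kmers", DNA_kmers)  # DNA_kmers ['AT', 'GC', 'AA']
```

Each k-mer is now appended at most once, and only after all of its letters have been checked.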
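If you are not familiar with Python: I have used an else clause on the for loop. This is not available in all languages. The else block runs if and only if the loop completes all of its iterations, that is, when it is not exited early with break.

There are very simple ways to do what you want. For example, the following does the job with a nested list comprehension: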
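One way such a comprehension could look, as a sketch that assumes all() over an inner generator expression:

```python
kmers = ["AL", "AT", "GC", "AA", "AP"]

# Keep a k-mer only if every one of its letters is a valid DNA letter.
DNA_kmers = [
    kmer
    for kmer in kmers
    if all(letter in "ATGC" for letter in kmer)
]

print("DNA_kmers", DNA_kmers)  # DNA_kmers ['AT', 'GC', 'AA']
```

all() stops at the first invalid letter, so it behaves like the break in an explicit loop.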
A more general solution that also performs better is to use a regular expression:
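A sketch of the regular-expression approach, assuming re.fullmatch semantics with a pattern that accepts k-mers of any length (the name dna_pattern is illustrative):

```python
import re

kmers = ["AL", "AT", "GC", "AA", "AP"]

# One or more DNA letters; fullmatch requires the whole string to match.
dna_pattern = re.compile(r"[ACGT]+")

DNA_kmers = [kmer for kmer in kmers if dna_pattern.fullmatch(kmer)]

print("DNA_kmers", DNA_kmers)  # DNA_kmers ['AT', 'GC', 'AA']
```

Compiling the pattern once outside the comprehension avoids looking it up again for every k-mer.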
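If we restrict the problem to k-mers with k=2, we can optimize further. Matching a fixed-length string, for example with [AGCT]{2}, should make the regular expression slightly faster. We can also use product to build a set for constant-time lookup: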