Python: re.. find longest sequence

User

Reply #1 · edited 2024-04-20 00:31:57

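import re
# Match every run of characters that are not the "|" marker.
pat = re.compile("[^|]+")
# Replace each "diNCO diamine" unit with "|" and drop spaces, so repeats become runs of "|".
p = "diol diNCO diamine diNCO diamine diNCO diamine diNCO diol diNCO diamine".replace("diNCO diamine","|").replace(" ","")
# Splitting on the non-marker runs leaves only the "|" runs; the longest one is the answer.
print(max(map(len, pat.split(p))))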
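
The trick above may be easier to follow when broken into steps. The following is only a sketch: the helper name `longest_run` and its `marker` parameter are illustrative and not part of the original post, and it assumes the marker character never occurs in the input string.

```python
import re

def longest_run(polymer, unit, marker="|"):
    # Hypothetical helper: assumes `marker` never occurs in `polymer`.
    # Every occurrence of `unit` becomes one marker; spaces are dropped
    # so that adjacent repeats turn into adjacent markers.
    collapsed = polymer.replace(unit, marker).replace(" ", "")
    # Each run of consecutive markers is one run of consecutive repeats.
    runs = re.findall(re.escape(marker) + "+", collapsed)
    return max(map(len, runs), default=0)

polymer = "diol diNCO diamine diNCO diamine diNCO diamine diNCO diol diNCO diamine"
print(longest_run(polymer, "diNCO diamine"))  # 3
```

Returning 0 when the unit never appears is a design choice made here; the original snippet does not say what it should do in that case.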

User

Reply #2 · edited 2024-04-20 00:31:57

Expanding on Ealdwulf's answer:

re.findall is documented in the Python standard library documentation for the re module.

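import re

def getLongestSequenceSize(search_str, polymer_str):
    # Each match is one run of consecutive repeats of search_str.
    matches = re.findall(r'(?:\b%s\b\s?)+' % re.escape(search_str), polymer_str)
    # Choose the longest run by length; max() alone would compare alphabetically.
    longest_match = max(matches, key=len, default='')
    return longest_match.count(search_str)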

This could be written as a one-liner, but it would be less readable in that form.
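
For comparison, one possible one-line version is sketched below. The name `longest_sequence_size_oneliner` is made up for this illustration, and the use of `key=len`, `re.escape`, and `default=''` are defensive assumptions rather than something the post spells out.

```python
import re

def longest_sequence_size_oneliner(search_str, polymer_str):
    # Find all runs, keep the longest by length, count the repeats in it.
    return max(re.findall(r'(?:\b%s\b\s?)+' % re.escape(search_str), polymer_str), key=len, default='').count(search_str)

polymer = "diol diNCO diamine diNCO diamine diNCO diamine diNCO diol diNCO diamine"
print(longest_sequence_size_oneliner("diNCO diamine", polymer))  # 3
```

It does the same work in one expression, which shows why the multi-line form is easier to read.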

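Alternative:

If polymer_str is very large, re.finditer is more memory-efficient, because it does not build a list of all matches up front. You could do it like this: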

def getLongestSequenceSize(search_str, polymer_str):
    longest_match = ''
    # Walk the matches one at a time and keep only the longest run seen so far.
    for match in re.finditer(r'(?:\b%s\b\s?)+' % re.escape(search_str), polymer_str):
        if len(match.group(0)) > len(longest_match):
            longest_match = match.group(0)
    return longest_match.count(search_str)

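The biggest difference between findall and finditer is that the first returns a list object, while the second returns an iterator that yields match objects one at a time. The finditer approach may also be somewhat slower.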
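
A small sketch of that difference, using a shortened sample string invented for this illustration:

```python
import re

text = "diNCO diamine diNCO diamine diol diNCO diamine"
pattern = r'(?:\bdiNCO diamine\b\s?)+'

# findall builds the complete list of matched strings immediately.
found = re.findall(pattern, text)
print(type(found).__name__)  # list

# finditer returns an iterator; match objects are produced only on demand.
lazy = re.finditer(pattern, text)
first = next(lazy)
print(first.group(0), first.span())
```

Because the pattern has no capturing groups, findall returns the full matched text of each run, which is the same string that `match.group(0)` gives for each match object from finditer.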

User

Reply #3 · edited 2024-04-20 00:31:57

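I think the OP wants the longest contiguous sequence. You can get all contiguous sequences with something like: seqs = re.findall("(?:diNCO diamine)+", polymer)

Then find the longest one.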
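
A minimal sketch of those two steps follows. It assumes the repeats are separated by single spaces, so the group here allows an optional trailing space, which is a small change from the pattern quoted above. The sample string is borrowed from reply #1.

```python
import re

polymer = "diol diNCO diamine diNCO diamine diNCO diamine diNCO diol diNCO diamine"

# Assumption: repeats are separated by single spaces, hence the optional " ?".
seqs = re.findall(r"(?:diNCO diamine ?)+", polymer)

# Pick the longest run by length, then count how many repeats it contains.
longest = max(seqs, key=len, default="")
print(longest.count("diNCO diamine"))  # 3
```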
