How do I replace all occurrences of a subsequence in a list in Python?

Posted 2024-04-24 03:16:31


The code I'm currently using:

def replace_subsequence(l,a,b):
    """ Replace all occurrences of subsequence a with b in list l """
    for i in range(len(l)):
        if l[i:i+len(a)] == a:
            l[i:i+len(a)] = b

Example:
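The sample that accompanied the question did not survive on this page. As a stand-in, here is a hypothetical call with values chosen purely for illustration (they are an assumption, not the asker's original data):

```python
def replace_subsequence(l, a, b):
    """Replace all occurrences of subsequence a with b in list l (in place)."""
    for i in range(len(l)):
        if l[i:i + len(a)] == a:
            l[i:i + len(a)] = b

# Illustrative values only; not taken from the original question.
l = [1, 2, 3]
replace_subsequence(l, [2, 3], [4])
print(l)  # [1, 4]
```

Note that the function modifies the list in place and returns None.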

Is there a more efficient and/or more elegant way to do this?


3 Answers

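It's definitely not elegant, but I wonder whether converting to a string and using string replacement would perform better, provided your data is as simple as in the example...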

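def strx(l):
    # "[1, 2, 3]" -> "1, 2, 3"
    return str(l).strip('[]')

def replace_substring(l, a, b):
    # Caveats: the result is a list of strings, not the original items, and
    # matching is textual, so [1, 2] also matches inside [11, 2].
    return strx(l).replace( strx(a), strx(b) ).split(', ')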
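A minimal sketch of how the string idea could be made safer, assuming the list holds only integers. The helper name `replace_via_string` and the parenthesised encoding are assumptions made for this illustration and are not part of the answer above. Wrapping every item in its own delimiters keeps matches aligned to item boundaries, and the result is converted back to integers.

```python
import re

def replace_via_string(l, a, b):
    """String-based replacement for lists of ints (hypothetical helper)."""
    if not a:
        # An empty pattern would match everywhere; return an unchanged copy.
        return list(l)

    def encode(seq):
        # [1, 22] -> "(1)(22)": each item carries its own delimiters.
        return "".join("(%d)" % x for x in seq)

    replaced = encode(l).replace(encode(a), encode(b))
    # Decode "(1)(4)" back into [1, 4].
    return [int(token) for token in re.findall(r"\((-?\d+)\)", replaced)]

print(replace_via_string([1, 2, 3, 2, 3], [2, 3], [4]))  # [1, 4, 4]
print(replace_via_string([11, 2], [1, 2], [9]))          # [11, 2]
```

Unlike the in-place function from the question, this returns a new list and leaves the input untouched.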

For better efficiency, you can use the Boyer–Moore string search algorithm to find the sublist within the list.

Code (credits):

def match(pattern, seq):
    """Return the start index of every occurrence of pattern in seq,
    using the Boyer-Moore bad-character rule (items must be hashable)."""
    matches = []
    m = len(seq)
    n = len(pattern)
    if n == 0:
        return matches

    rightMostIndexes = preprocessForBadCharacterShift(pattern)

    alignedAt = 0
    while alignedAt + (n - 1) < m:

        # Compare the pattern against the window from right to left.
        for indexInPattern in range(n - 1, -1, -1):
            indexInSeq = alignedAt + indexInPattern
            x = seq[indexInSeq]
            y = pattern[indexInPattern]

            if x != y:

                r = rightMostIndexes.get(x)

                if r is None:
                    # x never occurs in the pattern: jump past it.
                    alignedAt = indexInSeq + 1

                else:
                    # Line up the rightmost x in the pattern with this x,
                    # but always move forward by at least one position.
                    shift = indexInPattern - r
                    alignedAt += shift if shift > 0 else 1

                break
            elif indexInPattern == 0:
                matches.append(alignedAt)
                alignedAt += 1

    return matches

def preprocessForBadCharacterShift(pattern):
    """Map each item of the pattern to its rightmost index."""
    rightMost = {}
    for i in range(len(pattern) - 1, -1, -1):
        c = pattern[i]
        if c not in rightMost:
            rightMost[c] = i

    return rightMost

if __name__ == "__main__":
    matches = match("ana", "bananas")
    for integer in matches:
        print("Match at:", integer)
    print("OK" if matches == [1, 3] else "Failed")

    matches = match([1, 2, 3], [0, 1, 2, 3, 4, 5, 6])
    for integer in matches:
        print("list Match at:", integer)
    print(matches)
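The search only reports where the matches start, so the replacement step still has to be done separately. A minimal sketch of that step, assuming the start indexes come from a search like the one in this answer: the helper name `replace_at` and its parameters are hypothetical. It builds a new list and skips any match that overlaps one already replaced.

```python
def replace_at(seq, starts, n, replacement):
    """Return a copy of seq with each length-n match replaced.

    starts: start indexes of the matches (hypothetical input, e.g. the
    output of a subsequence search); n is assumed to be at least 1.
    """
    result = []
    pos = 0  # first index of seq not yet copied or replaced
    for start in sorted(starts):
        if start < pos:
            continue  # overlaps a match that was already replaced
        result.extend(seq[pos:start])   # copy the untouched gap
        result.extend(replacement)      # emit the replacement items
        pos = start + n                 # skip over the matched items
    result.extend(seq[pos:])            # copy the tail
    return result

# Match positions written out by hand for the two examples in this answer.
print(replace_at([0, 1, 2, 3, 4, 5, 6], [1], 3, ["x"]))  # [0, 'x', 4, 5, 6]
print(replace_at(list("bananas"), [1, 3], 3, ["X"]))     # ['b', 'X', 'n', 'a', 's']
```

Building a new list avoids the index shifting that happens when slices of different lengths are assigned in place while scanning.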

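In Python 2, using xrange is a simple improvement that speeds up the code. xrange returns a lazy sequence object instead of building the whole list up front, so the gain is especially noticeable for long lists. Even with your short test code I still got a noticeable improvement. (In Python 3, range is already lazy and xrange no longer exists.)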

Using timeit:

replace_subsequence        0.337936162949, 100000 runs
replace_subsequence_xrange 0.275990962982, 100000 runs

Also, you should assign len(a) to a variable outside the loop so that len() is not called on every iteration. This should give a further speedup.
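A minimal sketch of how such a comparison could be set up with timeit, written for Python 3. The name `replace_subsequence_hoisted`, the helper `run`, and the test data are assumptions made for this illustration; the timings will vary by machine, so none are shown here.

```python
import timeit

def replace_subsequence(l, a, b):
    """The question's version: len(a) is re-evaluated on every iteration."""
    for i in range(len(l)):
        if l[i:i + len(a)] == a:
            l[i:i + len(a)] = b

def replace_subsequence_hoisted(l, a, b):
    """Same logic, with len(a) computed once before the loop."""
    n = len(a)
    for i in range(len(l)):
        if l[i:i + n] == a:
            l[i:i + n] = b

def run(func):
    # Hypothetical test data; each run works on a fresh list because
    # both functions modify their argument in place.
    data = [1, 2, 3, 4, 2, 3, 5]
    func(data, [2, 3], [9])
    return data

if __name__ == "__main__":
    for func in (replace_subsequence, replace_subsequence_hoisted):
        seconds = timeit.timeit(lambda: run(func), number=100000)
        print(func.__name__, seconds)
```

Both functions produce the same list, so any difference in the printed times comes from the repeated len() calls.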
