库itertools与Python代码的性能比较

5 投票

2 回答

7610 浏览

提问于 2025-04-17 19:30

作为我提问的回答，链接是关于如何找到两个列表相同的第一个位置。我得到了一个建议，使用C语言库中的itertools来加快速度。

为了验证这个建议，我用cProfile写了一个测试：

from itertools import takewhile, izip

def match_iter(self, other):
    return sum(1 for x in takewhile(lambda x: x[0] == x[1],
                                        izip(self, other)))

def match_loop(self, other):
    element = -1
    for element in range(min(len(self), len(other))):
        if self[element] != other[element]:
            element -= 1
            break
    return element +1

def test():
    a = [0, 1, 2, 3, 4]
    b = [0, 1, 2, 3, 4, 0]

    print("match_loop a=%s, b=%s, result=%s" % (a, b, match_loop(a, b)))
    print("match_iter a=%s, b=%s, result=%s" % (a, b, match_iter(a, b)))

    i = 10000
    while i > 0:
        i -= 1
        match_loop(a, b)
        match_iter(a, b)

def profile_test():
    import cProfile
    cProfile.run('test()')

if __name__ == '__main__':
    profile_test()

match_iter()这个函数使用了itertools，而match_loop()是我之前用普通Python实现的函数。

test()这个函数定义了两个列表，并打印出这两个函数的结果，以确认它们的工作情况。两个结果都是5，这个值是因为两个列表相等的长度。然后，它会对这两个函数循环执行10,000次。

最后，整个过程通过profile_test()进行性能分析。

我了解到，izip在Python3的itertools中并没有实现，至少在我使用的debian wheezy中是这样。所以我用python2.7进行了测试。

以下是结果：

python2.7 match_test.py
match_loop a=[0, 1, 2, 3, 4], b=[0, 1, 2, 3, 4, 0], result=5
match_iter a=[0, 1, 2, 3, 4], b=[0, 1, 2, 3, 4, 0], result=5
         180021 function calls in 0.636 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.636    0.636 <string>:1(<module>)
        1    0.039    0.039    0.636    0.636 match_test.py:15(test)
    10001    0.048    0.000    0.434    0.000 match_test.py:3(match_iter)
    60006    0.188    0.000    0.275    0.000 match_test.py:4(<genexpr>)
    50005    0.087    0.000    0.087    0.000 match_test.py:4(<lambda>)
    10001    0.099    0.000    0.162    0.000 match_test.py:7(match_loop)
    20002    0.028    0.000    0.028    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    10001    0.018    0.000    0.018    0.000 {min}
    10001    0.018    0.000    0.018    0.000 {range}
    10001    0.111    0.000    0.387    0.000 {sum}

让我感到奇怪的是，从cumtime的值来看，我的普通Python版本在10,000次循环中只用了0.162秒，而match_iter版本却用了0.434秒。

一方面，Python的速度非常快，这很好，所以我不需要担心。但这是否正确，C语言库的执行时间比简单的Python代码还要长两倍以上？还是我犯了什么致命的错误？

为了验证，我还用python2.6进行了测试，似乎速度更快，但循环和itertools之间的差异还是一样。

有没有经验丰富的人愿意帮忙？

代码优化 c语言性能分析 itertools cprofile 列表比较速度测试 python2.7

2 个回答

首先，给你点个赞，居然真的去测量了时间。
其次，代码的可读性通常比运行速度更重要。如果你的代码快了3倍，但你每3周有2周都在调试它，那就不值得你花时间了。
第三，你也可以用 timeit 来测量小段代码的执行时间。我觉得这个方法比用 profile 要简单一些。（不过 profile 对于找出性能瓶颈还是挺有用的）。

itertools 通常来说是比较快的。然而，尤其在这个例子中，你的 takewhile 会让速度变慢，因为 itertools 需要对每一个元素调用一个函数。在 Python 中，每次函数调用都会有一些额外的开销，这可能会让你变慢一点（还有创建 lambda 函数本身的开销）。注意，使用生成器表达式的 sum 也会增加一些开销。不过，最终看来，在这种情况下，基本的循环总是表现得更好。

from itertools import takewhile, izip

def match_iter(self, other):
    return sum(1 for x in takewhile(lambda x: x[0] == x[1],
                                        izip(self, other)))

def match_loop(self, other):
    cmp = lambda x1,x2: x1 == x2

    for element in range(min(len(self), len(other))):
        if self[element] == other[element]:
            element += 1
        else:
            break

    return element

def match_loop_lambda(self, other):
    cmp = lambda x1,x2: x1 == x2

    for element in range(min(len(self), len(other))):
        if cmp(self[element],other[element]):
            element += 1
        else:
            break

    return element

def match_iter_nosum(self,other):
    element = 0
    for _ in takewhile(lambda x: x[0] == x[1],
                       izip(self, other)):
        element += 1
    return element

def match_iter_izip(self,other):
    element = 0
    for x1,x2 in izip(self,other):
        if x1 == x2:
            element += 1
        else:
            break
    return element



a = [0, 1, 2, 3, 4]
b = [0, 1, 2, 3, 4, 0]

import timeit

print timeit.timeit('match_iter(a,b)','from __main__ import a,b,match_iter')
print timeit.timeit('match_loop(a,b)','from __main__ import a,b,match_loop')
print timeit.timeit('match_loop_lambda(a,b)','from __main__ import a,b,match_loop_lambda')
print timeit.timeit('match_iter_nosum(a,b)','from __main__ import a,b,match_iter_nosum')
print timeit.timeit('match_iter_izip(a,b)','from __main__ import a,b,match_iter_izip')

不过要注意，最快的版本其实是循环和 itertools 的结合。这种（显式的）对 izip 的循环在我看来也更容易阅读。所以，我们可以得出结论，takewhile 是比较慢的部分，而不一定是 itertools 本身的问题。

回答于 2025-04-17 由 Python大师

分享举报

我想这里的问题是你的测试列表太小了——这意味着任何差异都可能很小，而创建迭代器的成本可能超过了它们带来的好处。

在更大的测试中（性能更可能重要的地方），使用 sum() 的版本可能会比其他版本表现得更好。

还有一个风格的问题——手动版本更长，并且依赖于通过索引来迭代，这让它的灵活性降低了。

我认为最易读的解决方案应该像这样：

def while_equal(seq, other):
    for this, that in zip(seq, other):
        if this != that:
            return
        yield this

def match(seq, other):
    return sum(1 for _ in while_equal(seq, other))

有趣的是，在我的系统上，这个稍微修改过的版本：

def while_equal(seq, other):
    for this, that in zip(seq, other):
        if this != that:
            return
        yield 1

def match(seq, other):
    return sum(while_equal(seq, other))

比纯循环版本的表现更好：

a = [0, 1, 2, 3, 4]
b = [0, 1, 2, 3, 4, 0]

import timeit

print(timeit.timeit('match_loop(a,b)', 'from __main__ import a, b, match_loop'))
print(timeit.timeit('match(a,b)', 'from __main__ import match, a, b'))

结果是：

1.3171300539979711
1.291257290984504

话虽如此，如果我们把纯循环版本改得更符合Python的风格：

def match_loop(seq, other):
    count = 0
    for this, that in zip(seq, other):
        if this != that:
            return count
        count += 1
    return count

在我这里，这个版本的运行时间是 0.8548871780512854，明显比其他任何方法都快，同时也保持了可读性。这可能是因为原始版本通过索引循环，而这种方式通常非常慢。不过，我还是会选择这篇文章中的第一个版本，因为我觉得它最易读。

回答于 2025-04-17 由 Python大师

分享举报

库itertools与Python代码的性能比较

2 个回答

撰写回答