Optimized dot product in Python

Question

The dot product of two n-dimensional vectors u=[u1,u2,...,un] and v=[v1,v2,...,vn] is given by u1*v1 + u2*v2 + ... + un*vn.
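As a quick worked example of that definition, here is a minimal sketch. The helper name dot and the sample vectors are made up for illustration and are not part of the benchmark.

```python
def dot(u, v):
    """Sum of element-wise products, straight from the definition."""
    if len(u) != len(v):
        raise ValueError("both vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))

# Illustrative values only
u = [1, 2, 3]
v = [4, 5, 6]
result = dot(u, v)  # 1*4 + 2*5 + 3*6 = 32
print(result)
```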

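A question posted yesterday made me want to find the fastest way to compute dot products in Python using only the standard library, with no third-party modules and no C/Fortran/C++ calls.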

I timed four different approaches; so far the fastest seems to be sum(starmap(mul,izip(v1,v2))) (where starmap and izip come from the itertools module).
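To spell out what that expression does: izip pairs up the elements lazily, starmap unpacks each pair as the arguments to mul, and sum adds up the products. The sketch below materializes the intermediate steps only to make them visible. The sample lists are illustrative, and the zip fallback is an assumption added so the snippet also runs on Python 3, where itertools.izip no longer exists.

```python
from itertools import starmap
from operator import mul

try:
    from itertools import izip  # Python 2
except ImportError:
    izip = zip  # assumption: on Python 3 the built-in zip is already lazy

# Illustrative values only
a = [1, 2, 3]
b = [4, 5, 6]

pairs = list(izip(a, b))               # [(1, 4), (2, 5), (3, 6)]
products = list(starmap(mul, pairs))   # mul(1, 4), mul(2, 5), mul(3, 6)
total = sum(starmap(mul, izip(a, b)))  # same pipeline, no intermediate lists
print(total)
```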

For the code presented below, these are the elapsed times (in seconds, for one million runs):

d0: 12.01215
d1: 11.76151
d2: 12.54092
d3: 09.58523

Can you think of a faster way to do this?

import timeit # module with timing subroutines
import random # module to generate random numbers
from itertools import imap,starmap,izip
from operator import mul

def v(N=50,min=-10,max=10):
    """Generates a random vector (as a list) of dimension N; the
    values are integers in the range [min,max]."""
    out = []
    for k in range(N):
        out.append(random.randint(min,max))
    return out

def check(v1,v2):
    """Raise ValueError unless both vectors have the same length."""
    if len(v1)!=len(v2):
        raise ValueError("the length of both arrays must be the same")

def d0(v1,v2):
    """
    d0 is the nominal approach:
    multiply/add in a loop
    """
    check(v1,v2)
    out = 0
    for k in range(len(v1)):
        out += v1[k] * v2[k]
    return out

def d1(v1,v2):
    """                                                                                                     
    d1 uses an imap (from itertools)                                                                        
    """
    check(v1,v2)
    return sum(imap(mul,v1,v2))

def d2(v1,v2):
    """                                                                                                     
    d2 uses a conventional map                                                                              
    """
    check(v1,v2)
    return sum(map(mul,v1,v2))

def d3(v1,v2):
    """                                                                                                     
    d3 uses a starmap (itertools) to apply the mul operator on an izipped (v1,v2)                           
    """
    check(v1,v2)
    return sum(starmap(mul,izip(v1,v2)))

# generate the test vectors                                                                                 
v1 = v()
v2 = v()

if __name__ == '__main__':

    # Time each implementation; timeit() runs the statement 1,000,000 times by default

    t0 = timeit.Timer("d0(v1,v2)","from dot_product import d0,v1,v2")
    t1 = timeit.Timer("d1(v1,v2)","from dot_product import d1,v1,v2")
    t2 = timeit.Timer("d2(v1,v2)","from dot_product import d2,v1,v2")
    t3 = timeit.Timer("d3(v1,v2)","from dot_product import d3,v1,v2")

    print "d0 elapsed: ", t0.timeit()
    print "d1 elapsed: ", t1.timeit()
    print "d2 elapsed: ", t2.timeit()
    print "d3 elapsed: ", t3.timeit()

Note that the file must be named dot_product.py for the script to run; I used Python 2.5.1 on Mac OS X 10.5.8.
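The filename requirement comes from the setup strings, which import the functions from a module called dot_product. As a sketch of an alternative, assuming Python 2.6 or later (where timeit.Timer accepts a zero-argument callable, which is not the case in the 2.5.1 used here), the function can be timed directly without any import string. The name dot_starmap, the list-comprehension vectors, and the reduced iteration count are illustrative choices, not part of the original script.

```python
import random
import timeit
from itertools import starmap
from operator import mul

try:
    from itertools import izip  # Python 2
except ImportError:
    izip = zip  # assumption: fall back to the lazy built-in on Python 3

def dot_starmap(v1, v2):
    """Same expression as d3, without the length check."""
    return sum(starmap(mul, izip(v1, v2)))

# Two random test vectors of dimension 50, values in [-10, 10]
a = [random.randint(-10, 10) for _ in range(50)]
b = [random.randint(-10, 10) for _ in range(50)]

# Passing a callable avoids the "from dot_product import ..." setup string
timer = timeit.Timer(lambda: dot_starmap(a, b))
elapsed = timer.timeit(number=10000)  # fewer runs than the default one million
print("dot_starmap elapsed: %.5f s for 10000 calls" % elapsed)
```

Note that the lambda adds one extra function call per iteration, so absolute numbers from this variant are not directly comparable with the string-based timings.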

EDIT:

I ran the script for N=1000 and these are the results (in seconds, for one million runs):

d0: 205.35457
d1: 208.13006
d2: 230.07463
d3: 155.29670

I guess it is safe to assume that d3 is indeed the fastest and d2 the slowest of the four approaches presented.
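To put the two sets of timings side by side, the following sketch computes each approach's running time relative to d3. It only does arithmetic on the numbers reported in this question; the variable names are illustrative.

```python
# Timings reported in this question, in seconds per one million calls
timings = {
    50:   {"d0": 12.01215, "d1": 11.76151, "d2": 12.54092, "d3": 9.58523},
    1000: {"d0": 205.35457, "d1": 208.13006, "d2": 230.07463, "d3": 155.29670},
}

relative = {}  # relative[N][name] = time of that approach divided by d3's time
for n in sorted(timings):
    results = timings[n]
    fastest = min(results, key=results.get)
    slowest = max(results, key=results.get)
    relative[n] = dict((name, t / results["d3"]) for name, t in results.items())
    print("N=%d: fastest=%s, slowest=%s" % (n, fastest, slowest))
    for name in sorted(results):
        print("  %s takes %.2fx the time of d3" % (name, relative[n][name]))
```

On these numbers d0 takes roughly 1.25 times as long as d3 at N=50 and roughly 1.32 times as long at N=1000, while d0 and d1 swap places between the two runs.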
