Eigen matrix vs. NumPy array multiplication performance

10 votes

3 answers

11893 views

Asked 2025-04-18 12:05

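I read in this question that Eigen performs very well. However, when I compared the speed of Eigen's MatrixXi multiplication with NumPy's array multiplication, NumPy came out ahead (about 26 seconds versus 29 seconds). Is there a more efficient way to use Eigen?

Here is my code.

The NumPy version: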

import numpy as np
import time

n_a_rows = 4000
n_a_cols = 3000
n_b_rows = n_a_cols
n_b_cols = 200

a = np.arange(n_a_rows * n_a_cols).reshape(n_a_rows, n_a_cols)
b = np.arange(n_b_rows * n_b_cols).reshape(n_b_rows, n_b_cols)

start = time.time()
d = np.dot(a, b)
end = time.time()

print("time taken : {}".format(end - start))

Result:

time taken : 25.9291000366

The Eigen version:

#include <ctime>      // clock(), clock_t, CLOCKS_PER_SEC
#include <iostream>
#include <Eigen/Dense>
using namespace Eigen;
int main()
{

  int n_a_rows = 4000;
  int n_a_cols = 3000;
  int n_b_rows = n_a_cols;
  int n_b_cols = 200;

  MatrixXi a(n_a_rows, n_a_cols);

  for (int i = 0; i < n_a_rows; ++ i)
      for (int j = 0; j < n_a_cols; ++ j)
        a (i, j) = n_a_cols * i + j;

  MatrixXi b (n_b_rows, n_b_cols);
  for (int i = 0; i < n_b_rows; ++ i)
      for (int j = 0; j < n_b_cols; ++ j)
        b (i, j) = n_b_cols * i + j;

  MatrixXi d (n_a_rows, n_b_cols);

  clock_t begin = clock();

  d = a * b;

  clock_t end = clock();
  double elapsed_secs = double(end - begin) / CLOCKS_PER_SEC;
  std::cout << "Time taken : " << elapsed_secs << std::endl;

}

Result:

Time taken : 29.05

I am using NumPy 1.8.1 and Eigen 3.2.0-4.
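A side note on the measurement itself: the NumPy script uses time.time(), which reads the wall clock, while the C++ program uses clock(), which on POSIX systems reports processor time. For a single-threaded run the two are usually close, but they can diverge when a library spreads the work over several threads. The sketch below is only an illustration in Python of that difference; the helper name `measure` is made up for this example and appears in neither program above.

```python
import time

def measure(work):
    """Run work() once and return (wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # wall-clock timer, comparable to time.time()
    cpu_start = time.process_time()    # CPU time of this process, comparable to clock() on POSIX
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu

# Sleeping takes wall-clock time but almost no CPU time,
# so the two timers report very different numbers.
wall, cpu = measure(lambda: time.sleep(0.2))
print("wall: {:.3f} s, cpu: {:.3f} s".format(wall, cpu))
```

When comparing two programs, it is safer to use the same kind of timer in both.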

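Tags: numerical computing, matrix operations, linear algebra, optimization, C++ libraries, performance comparison, array multiplication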

3 Answers

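"Is there a more efficient way to use Eigen?"

When you multiply matrices and the matrix on the left-hand side of the assignment does not appear on the right-hand side, you can safely tell Eigen that no aliasing takes place. In other words, you can skip an unnecessary temporary and the copy that goes with it. For large matrices this can improve performance significantly. You do this with the .noalias() function: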

d.noalias() = a * b;

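This way a*b is evaluated directly into d. Otherwise, to guard against aliasing, Eigen first evaluates the product into a temporary and then assigns that temporary to the target matrix d.

So, in your code, the line: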

d = a * b;

is effectively evaluated as:

temp = a*b;
d = temp;
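For readers coming from the NumPy side, a loosely analogous distinction exists there. This is not Eigen code and not what .noalias() does internally; it is only a sketch of the same idea (temporary plus copy versus writing straight into the destination) using the out argument of np.dot. The array names and sizes are chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((400, 300))
b = rng.random((300, 200))

# Case 1: np.dot allocates a new result array, which is then copied into d.
# This mirrors "temp = a*b; d = temp;".
d = np.empty((400, 200))
d[:] = np.dot(a, b)

# Case 2: np.dot writes directly into a preallocated array.
# The out array must be C-contiguous and have the result's shape and dtype.
d_direct = np.empty((400, 200))
returned = np.dot(a, b, out=d_direct)

print("wrote in place:", returned is d_direct)
print("same values:", np.allclose(d, d_direct))
```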

Answered 2025-04-18 by Python大师


Change the following:

a = np.arange(n_a_rows * n_a_cols).reshape(n_a_rows, n_a_cols)
b = np.arange(n_b_rows * n_b_cols).reshape(n_b_rows, n_b_cols)

to:

a = np.arange(n_a_rows * n_a_cols).reshape(n_a_rows, n_a_cols)*1.0
b = np.arange(n_b_rows * n_b_cols).reshape(n_b_rows, n_b_cols)*1.0

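On my laptop, this makes the multiplication nearly 100 times faster (about 89x going by the two timings below). Before the change: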

time taken : 11.1231250763

After the change:

time taken : 0.124922037125

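That is, unless you really need integer multiplication. In Eigen, multiplying double-precision numbers is also faster (this amounts to replacing MatrixXi with MatrixXd in three places), but the improvement I see is only about 1.5x: 0.555005 seconds versus 0.846788 seconds.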
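As background that the answer does not spell out: np.arange with integer arguments produces an integer array, and NumPy hands floating-point matrix products to an optimized BLAS routine while integer products go through a slower generic path. The sketch below compares the two on smaller matrices than in the question so that it runs quickly. The helper `time_dot` and the reduced sizes are assumptions made for this illustration, and the actual timings depend on the machine and the BLAS library NumPy was built against.

```python
import time
import numpy as np

def time_dot(a, b):
    """Return (result, seconds) for a single np.dot call."""
    start = time.perf_counter()
    result = np.dot(a, b)
    return result, time.perf_counter() - start

# Smaller shapes than in the question (assumed here for a quick run).
n_a_rows, n_a_cols, n_b_cols = 400, 300, 200

# int64 is requested explicitly so the integer width is the same on every platform.
a_int = np.arange(n_a_rows * n_a_cols, dtype=np.int64).reshape(n_a_rows, n_a_cols)
b_int = np.arange(n_a_cols * n_b_cols, dtype=np.int64).reshape(n_a_cols, n_b_cols)

# Multiplying by 1.0, as in the answer, converts the arrays to float64.
a_float = a_int * 1.0
b_float = b_int * 1.0

d_int, t_int = time_dot(a_int, b_int)
d_float, t_float = time_dot(a_float, b_float)

print("dtypes:", a_int.dtype, a_float.dtype)
print("integer dot: {:.4f} s".format(t_int))
print("float dot:   {:.4f} s".format(t_float))
# At this size every entry is small enough to be represented exactly in float64.
print("same values:", np.array_equal(d_int, d_float))
```

Note that the two results only agree exactly while the values stay within what float64 can represent exactly; for much larger entries the float product is an approximation of the integer one.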

Answered 2025-04-18 by Python大师


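My question was answered in the comments by @Jitse Niesen and @ggael.

I needed to add flags that turn on compiler optimization: -O2 -DNDEBUG (that is a capital letter O, not the digit zero).

With these flags, the running time of the Eigen code dropped from ~29 seconds to 0.6 seconds.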

Answered 2025-04-18 by Python大师

