Is there a way to further reduce sparse solve times using Python?

IJS = load('KbN1M.txt');
b = load('FbN1M.txt');
I = IJS(:,1); J = IJS(:,2); S = IJS(:,3);
Neval = 10;
tsparse = zeros(Neval,1);
tsolve_direct = zeros(Neval,1);
tsolve_sparse = zeros(Neval,1);
tsolve_pcg = zeros(Neval,1);
for i = 1:Neval
    tic
    A = sparse(I,J,S);
    tsparse(i) = toc;
    tic
    x = A\b;
    tsolve_direct(i) = toc;
    tic
    x2 = pcg(A,b,1e-5,size(b,1));
    tsolve_pcg(i) = toc;
end
save -ascii octave_n1M_tsparse.txt tsparse
save -ascii octave_n1M_tsolvedirect.txt tsolve_direct
save -ascii octave_n1M_tsolvepcg.txt tsolve_pcg

import time
from scipy import sparse as sp
from scipy.sparse import linalg
import numpy as np
from scikits.umfpack import spsolve, splu  # NEEDS LINUX

b = np.loadtxt('FbN1M.txt')
triplets = np.loadtxt('KbN1M.txt')
I = triplets[:,0] - 1
J = triplets[:,1] - 1
V = triplets[:,2]
I = I.astype(int)
J = J.astype(int)
NN = int(b.shape[0])

Neval = 10
time_sparse = np.zeros((Neval,1))
time_direct = np.zeros((Neval,1))
time_conj = np.zeros((Neval,1))
time_umfpack = np.zeros((Neval,1))

for i in range(Neval):
    t = time.time()
    A = sp.coo_matrix((V, (I, J)), shape=(NN, NN))
    A = sp.csr_matrix(A)
    time_sparse[i,0] = time.time() - t

    t = time.time()
    x = linalg.spsolve(A, b)
    time_direct[i,0] = time.time() - t

    t = time.time()
    x2 = sp.linalg.cg(A, b, x0=None, tol=1e-05)
    time_conj[i,0] = time.time() - t

    t = time.time()
    x3 = spsolve(A, b)  # ONLY IN LINUX
    time_umfpack[i,0] = time.time() - t

np.savetxt('pythonlinux_n1M_tsparse.txt', time_sparse, fmt='%.18f')
np.savetxt('pythonlinux_n1M_tsolvedirect.txt', time_direct, fmt='%.18f')
np.savetxt('pythonlinux_n1M_tsolvepcg.txt', time_conj, fmt='%.18f')
np.savetxt('pythonlinux_n1M_tsolveumfpack.txt', time_umfpack, fmt='%.18f')

1 answer

User

#1 · Posted 2024-05-23 21:17:08

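I will try to answer my own question. To do so, I tried a more demanding example: a matrix of size (N, N), about half a million by half a million, with the corresponding (N, 1) vector. This matrix is much less sparse (that is, denser) than the one in the question. Stored as ASCII it takes about 1.7 GB, whereas the matrix in the question takes about 0.25 GB, even though the latter has the larger dimensions.

[Figure: sparsity pattern of the matrix]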
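To make the distinction between dimensions and density concrete, here is a minimal sketch. The two random matrices are hypothetical stand-ins, not the actual systems from this thread; they only show that a matrix with smaller dimensions can hold more stored entries, and so cost more memory and solve time.

```python
from scipy import sparse as sp


def density(A):
    """Fraction of entries of a sparse matrix that are explicitly stored."""
    n_rows, n_cols = A.shape
    return A.nnz / (n_rows * n_cols)


# Hypothetical stand-ins: larger dimensions but sparser vs. smaller but denser.
big_sparse = sp.random(2000, 2000, density=0.001, format="csr", random_state=0)
small_dense = sp.random(1000, 1000, density=0.02, format="csr", random_state=0)

for name, A in [("big_sparse", big_sparse), ("small_dense", small_dense)]:
    print(name, A.shape, "nnz =", A.nnz, "density =", density(A))
```

The number of stored non-zeros (`nnz`), not the dimension N, is what mainly drives the file size and the memory needed by a direct solver.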

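Then I tried again to solve Ax=b with Matlab, Octave and Python, using the solvers mentioned above: scipy's direct solver, the Intel MKL wrapper, and Tim Davis's UMFPACK. My first surprise was that both Matlab and Octave could solve the system with A\b. Note that A\b is not necessarily a direct solver, because it picks a solver based on the characteristics of the matrix; see Matlab's x=A\b.

In Python, however, linalg.spsolve, the MKL wrapper and UMFPACK all threw out-of-memory errors on Windows and Linux. On the Mac, linalg.spsolve did somehow compute a solution; its performance was very poor, but it never threw a memory error. I wonder whether memory is handled differently depending on the operating system. It looks to me as if the Mac swapped memory to the hard drive instead of staying in RAM.

The CG solver in Python performed rather poorly compared to Matlab. However, for a symmetric system, computing A=0.5(A+A') first gives a huge improvement in the performance of the Python CG solver. Using a preconditioner in Python did not help: I tried building one with sp.linalg.spilu and sp.linalg.LinearOperator, but the performance was rather poor. In Matlab one can use the incomplete Cholesky decomposition.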
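For reference, the spilu plus LinearOperator combination mentioned above can be sketched roughly as follows. This is a minimal sketch on a small hypothetical test matrix (a 2D Poisson operator), not the system from this thread, and it is not meant to reproduce the timings. Note that an incomplete LU factor is not guaranteed to be symmetric, so using it with CG is a heuristic.

```python
import numpy as np
from scipy import sparse as sp
from scipy.sparse.linalg import LinearOperator, cg, spilu

# Hypothetical test problem: 2D Poisson matrix on an m-by-m grid (SPD).
m = 30
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(m, m))
A = (sp.kron(sp.identity(m), T) + sp.kron(T, sp.identity(m))).tocsc()
b = np.ones(A.shape[0])

# Incomplete LU factorisation; spilu expects CSC format.
ilu = spilu(A)

# Wrap the approximate solve as the preconditioner M ~ inv(A).
M = LinearOperator(A.shape, matvec=ilu.solve)

# Count iterations through the callback, called once per CG iteration.
iterations = []
x, info = cg(A, b, M=M, callback=lambda xk: iterations.append(1))

rel_res = np.linalg.norm(A @ x - b) / np.linalg.norm(b)
print("info =", info, "iterations =", len(iterations), "rel. residual =", rel_res)
```

The default tolerance of `cg` is left untouched here because the name of the tolerance keyword differs between SciPy versions (`tol` in older releases, `rtol` in newer ones). Whether the preconditioner pays off depends on how expensive `spilu` is for the matrix at hand, which matches the mixed experience described above.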

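For the out-of-memory problem, the solution was to compute an LU decomposition once and then solve two nested triangular systems: with A=LU, solve y=L\b first and then x=U\y. (For a symmetric positive definite matrix the same idea works with the Cholesky factorization A=LL', giving y=L\b and x=L'\y.)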
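The factor-once, solve-twice idea can be sketched as below. As an assumption for portability, this sketch uses SciPy's built-in `scipy.sparse.linalg.splu` (SuperLU) rather than the UMFPACK `splu` from `scikits.umfpack` imported in the question; the matrix is a small hypothetical example.

```python
import numpy as np
from scipy import sparse as sp
from scipy.sparse.linalg import splu

# Hypothetical small system; splu expects CSC format.
n = 200
A = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

# Factor once: SuperLU computes Pr * A * Pc = L * U.
lu = splu(A)

# lu.solve performs the forward (L) and backward (U) substitutions,
# including the row and column permutations.
x = lu.solve(b)

# The factorisation can be reused for further right-hand sides.
b2 = np.arange(n, dtype=float)
x2 = lu.solve(b2)

print(np.allclose(A @ x, b), np.allclose(A @ x2, b2))
```

Keeping the `lu` object around is what makes repeated solves cheap: only the two triangular substitutions are repeated, not the factorization.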

Here are the minimum solution times:

Matlab mac, A\b = 294 s.
Matlab mac, PCG (without preconditioner) = 17.9 s.
Matlab mac, PCG (with incomplete Cholesky preconditioner) = 9.8 s.
Scipy mac, direct = 4797 s.
Octave, A\b = 302 s.
Octave, PCG (without preconditioner) = 28.6 s.
Octave, PCG (with incomplete Cholesky preconditioner) = 11.4 s.
Scipy, PCG (without A=0.5(A+A'))= 119 s.
Scipy, PCG (with A=0.5(A+A'))= 12.7 s.
Scipy, LU decomposition using UMFPACK (Linux) = 3.7 s total.

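So the answer is yes: there are ways to improve the solution times in scipy. If the workstation's memory allows it, using the wrappers for UMFPACK (Linux) or for the Intel MKL QR solver is highly recommended. Otherwise, if you are dealing with a symmetric system, computing A=0.5(A+A') before calling the conjugate gradient solver can noticeably improve solution performance. Let me know if anyone is interested in this new system, and I will upload it.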
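The symmetrization step can be sketched as follows. The matrix here is a hypothetical example: a symmetric operator with a tiny artificial asymmetry added to mimic round-off in the assembled triplets. The sketch only shows the mechanics and does not reproduce the timings above.

```python
import numpy as np
from scipy import sparse as sp
from scipy.sparse.linalg import cg

# Hypothetical symmetric positive definite matrix ...
n = 400
base = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")

# ... plus a tiny artificial asymmetry (stand-in for round-off in the input).
noise = sp.random(n, n, density=0.001, format="csr", random_state=0) * 1e-10
A = (base + noise).tocsr()

# Symmetrise: A_sym = 0.5 * (A + A').
A_sym = (0.5 * (A + A.T)).tocsr()

asym_before = abs(A - A.T).max()
asym_after = abs(A_sym - A_sym.T).max()

b = np.ones(n)
x, info = cg(A_sym, b)  # info == 0 means the default tolerance was reached

rel_res = np.linalg.norm(A_sym @ x - b) / np.linalg.norm(b)
print("asymmetry before:", asym_before, "after:", asym_after, "info:", info)
```

Conjugate gradients assumes a symmetric positive definite matrix, so removing even a small asymmetry puts the solver back on the kind of problem it was designed for.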
