MemoryError when computing the dot product of a large sparse matrix

Posted 2024-04-19 21:48:04

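Consider the following situation: I have the adjacency matrix of a two-mode (bipartite) network, where one dimension represents items (posts) and the other represents the tags that appear under each item. I now want to fold this two-mode network into a one-mode item-to-item network, in which the value of each link is the number of tags the two items share. This can be done with a simple matrix multiplication, cn = tpm^T · tpm, where tpm is the tag-by-post matrix.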

Or in code:

from scipy.sparse import csr_matrix, save_npz, load_npz

# load matrix
tpm = csr_matrix(load_npz('tag_post_matrix.npz'))

# compute dot product
cn = tpm.transpose().dot(tpm)

# save result
save_npz('content_network_abs.npz', cn)
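To make the intended result concrete, here is a minimal sketch of the same projection on a tiny matrix. The data is hypothetical (3 tags as rows, 4 posts as columns) and is not taken from the real `tag_post_matrix.npz`; it only illustrates what the product is supposed to contain.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical toy data: rows are tags, columns are posts.
# Tag 0 appears on posts 0, 1, 2; tag 1 on posts 1, 2; tag 2 on post 3.
toy_tpm = csr_matrix(np.array([
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]))

# Same operation as in the question: posts x posts projection.
toy_cn = toy_tpm.transpose().dot(toy_tpm)

# Entry (i, j) is the number of tags shared by posts i and j;
# the diagonal holds the number of tags on each post.
print(toy_cn.toarray())
```

Posts 1 and 2 share two tags, so their link has weight 2, while post 3 shares nothing with the others. Note that a single tag used on three posts already fills a dense 3 x 3 block in the result.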

After running for a while, it fails with this error:

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
<ipython-input-27-10ff98c2505a> in <module>()
----> 1 cn = tpm.transpose().dot(tpm)
      2 save_npz(expand('content_network_abs.npz'), cn)
      3 

/opt/anaconda/lib/python3.7/site-packages/scipy/sparse/base.py in dot(self, other)
    359 
    360         """
--> 361         return self * other
    362 
    363     def power(self, n, dtype=None):

/opt/anaconda/lib/python3.7/site-packages/scipy/sparse/base.py in __mul__(self, other)
    477             if self.shape[1] != other.shape[0]:
    478                 raise ValueError('dimension mismatch')
--> 479             return self._mul_sparse_matrix(other)
    480 
    481         # If it's a list or whatever, treat it like a matrix

/opt/anaconda/lib/python3.7/site-packages/scipy/sparse/compressed.py in _mul_sparse_matrix(self, other)
    500                                     maxval=nnz)
    501         indptr = np.asarray(indptr, dtype=idx_dtype)
--> 502         indices = np.empty(nnz, dtype=idx_dtype)
    503         data = np.empty(nnz, dtype=upcast(self.dtype, other.dtype))
    504 

MemoryError:

I monitored RAM during execution and did not observe anything unusual (I have plenty of memory: ~1 TB).

The initial matrix has about 24,000,000 nonzero entries (it is very sparse), and I expect the resulting matrix to be very sparse as well.
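Whether the result really is sparse can be checked before multiplying. The traceback stops at `np.empty(nnz, ...)`, which suggests that the number of entries to allocate for the result was too large; an allocation that fails up front would also not show up as growing RAM usage. The sketch below computes a rough upper bound on the number of stored entries in `tpm.T @ tpm`: a tag used on d posts contributes at most d * d entries. The helper names and the byte sizes are assumptions for illustration, and the bound can considerably overestimate the true count when many posts share several tags.

```python
import numpy as np
from scipy.sparse import csr_matrix

def projection_nnz_upper_bound(tpm):
    """Upper bound on stored entries in tpm.T @ tpm (tags as rows)."""
    tpm = csr_matrix(tpm)
    # Number of stored entries per row, i.e. posts per tag.
    posts_per_tag = np.diff(tpm.indptr).astype(np.int64)
    bound = int(np.sum(posts_per_tag ** 2))
    n_posts = tpm.shape[1]
    # The result can never hold more than n_posts * n_posts entries.
    return min(bound, n_posts * n_posts)

def estimate_result_bytes(tpm, index_bytes=8, data_bytes=8):
    """Rough size of the indices and data arrays (byte sizes are assumed)."""
    return projection_nnz_upper_bound(tpm) * (index_bytes + data_bytes)

# Hypothetical toy data: 3 tags (rows) x 4 posts (columns).
example = csr_matrix(np.array([
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]))
print(projection_nnz_upper_bound(example))
print(estimate_result_bytes(example))
```

If a few tags appear on a very large share of the posts, this bound grows quadratically with their frequency, so a sparse input does not guarantee a sparse projection.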

Do I have a general misunderstanding of this topic, or is there a bug in the code?
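If the full product is simply too large to allocate in one piece, one possible workaround is to compute it in row blocks and handle each block separately (for example, save it to disk). This is only a sketch under that assumption, not a confirmed fix for the error above; `blockwise_projection` and `block_size` are hypothetical names, and the total size of all blocks is the same as that of the full result.

```python
import numpy as np
from scipy.sparse import csr_matrix, vstack

def blockwise_projection(tpm, block_size):
    """Yield (start, stop, block) where block = (tpm.T @ tpm)[start:stop]."""
    tpm = csr_matrix(tpm)
    # Convert the transpose to CSR so that row slicing is cheap.
    tpm_t = tpm.transpose().tocsr()
    n_posts = tpm.shape[1]
    for start in range(0, n_posts, block_size):
        stop = min(start + block_size, n_posts)
        yield start, stop, tpm_t[start:stop].dot(tpm)

# Hypothetical toy data: 3 tags (rows) x 4 posts (columns).
example = csr_matrix(np.array([
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]))

# For the toy case the blocks are stacked again to compare with the
# direct product; for a large matrix each block would be stored instead.
blocks = [block for _, _, block in blockwise_projection(example, block_size=2)]
stacked = vstack(blocks).tocsr()
full = example.transpose().dot(example)
print(np.array_equal(stacked.toarray(), full.toarray()))
```

Each block only needs memory for its own rows of the result, so the peak allocation per step shrinks roughly in proportion to `block_size`.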

Thanks in advance!
