Out of memory with np.sqrt and np.sum

Posted 2024-04-26 04:47:21

I have a CSV file containing user ratings for 56124 items (columns) from about 3000 users (rows). The ratings are integers less than 128. I have this function:

import numpy as np
import pandas as pd
from scipy import sparse

def sparse_to_npz(file, npz):
  print("Reading " + file + " ...")
  data_items = pd.read_csv(file)

  # Create a new dataframe without the user ids.
  # (The positional form drop('u', 1) is rejected by pandas 2.0 and later.)
  data_items = data_items.drop(columns='u')

  # As a first step we normalize the user vectors to unit vectors.

  # magnitude = sqrt(x2 + y2 + z2 + ...)
  magnitude = np.sqrt(np.square(data_items).sum(axis=1))

  # unitvector = (x / magnitude, y / magnitude, z / magnitude, ...)
  data_items = data_items.divide(magnitude, axis='index')
  del magnitude

  print("Saving to " + npz)
  data_sparse = sparse.csr_matrix(data_items)
  del data_items
  sparse.save_npz(npz, data_sparse)
  #np.save("columns", data_items.columns.values)

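The function takes two files: the input CSV file (sparse; each user has a rating for every item) and the npz file it should write in order to save memory. After reading the file with pandas and storing it in data_items, we need to compute the magnitude, divide data_items by it, and finally save the npz file. The problem is that I get an out-of-memory error at the step that computes the magnitude, np.sqrt(np.square(..., on a machine with 12 GB of memory. How can I do this?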
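One way to lower peak memory for the steps described above is to avoid holding the whole table as a dense DataFrame. For scale, 3000 x 56124 values stored as 64-bit integers come to roughly 1.3 GB, and np.square and divide each produce another full-size result on top of that. The sketch below is only an illustration under assumptions, not a tested fix for this exact file: it reads the CSV in row chunks, converts each chunk to a float32 CSR block, normalizes the rows inside the sparse block, and stacks the blocks at the end. The function name csv_to_normalized_npz and the parameters id_column and chunksize are hypothetical, and float32 is assumed to be precise enough for the ratings.

```python
import numpy as np
import pandas as pd
from scipy import sparse


def csv_to_normalized_npz(file, npz, id_column="u", chunksize=200):
    """Read the CSV in row chunks, scale each row to unit length, save as .npz.

    Hypothetical helper: the name, id_column and chunksize are illustrative.
    """
    blocks = []
    for chunk in pd.read_csv(file, chunksize=chunksize):
        # Drop the user id column and keep only this chunk as a dense array.
        values = chunk.drop(columns=id_column).to_numpy(dtype=np.float32)
        block = sparse.csr_matrix(values)

        # Row magnitudes computed from the stored non-zero entries only.
        norms = np.sqrt(np.asarray(block.multiply(block).sum(axis=1)).ravel())
        norms[norms == 0] = 1.0  # leave all-zero rows unchanged

        # Multiplying by a diagonal matrix of 1/norm scales each row.
        blocks.append(sparse.diags(1.0 / norms).dot(block).tocsr())

    data_sparse = sparse.vstack(blocks, format="csr")
    sparse.save_npz(npz, data_sparse)
    return data_sparse
```

With this layout only one chunk is dense at a time, so chunksize trades speed against peak memory. Whether it is enough on a 12 GB machine depends on how many ratings are actually non-zero.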

0 answers

No answers yet.