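I am implementing a sparse matrix factorization routine in Python that produces a sparse score matrix and a dense feature matrix. scikit-learn has a direct implementation, sklearn.decomposition.DictionaryLearning, which works for a matrix of size 1000x947 and returns results in 20 s on a 16-core AMD system with 16 GB of RAM. It is fast and easy to use, but I need to add extra constraints to the problem, which calls for something like the CVXPY toolbox. I am using CVXPY with the "SCS" solver, but the alternating minimization formulation for the sparse score matrix and the dense feature matrix will not even run on my computer, and it takes several days on an HPC cluster. What am I missing?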
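For reference, the alternating scheme in the CVXPY code below can be written as two subproblems. The symbols are shorthand introduced here, not notation from either library: $X$ stands for data_RA (size $m \times n$), $U$ for U_RA ($m \times k$), $V$ for V_RA ($k \times n$), $\alpha$ for alpha_value_RA, and $\lVert U \rVert_1$ for the sum of the absolute values of the entries of $U$. Each pass through the loop solves only one of the two, with the other factor held fixed.

```latex
\begin{aligned}
\text{odd iterations:}\quad
  V &\leftarrow \arg\min_{V \ge 0} \; \lVert X - U V \rVert_F + \alpha \lVert U \rVert_1
  && (U \text{ fixed, so the penalty is a constant}) \\
\text{even iterations:}\quad
  U &\leftarrow \arg\min_{U \ge 0} \; \lVert X - U V \rVert_F + \alpha \lVert U \rVert_1
  && (V \text{ fixed})
\end{aligned}
```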
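# scikit-learn implementation
import sklearn.decomposition as skd  # assumed meaning of the alias skd

# dict_learning returns a (code, dictionary, errors) tuple
estimator_RA = skd.dict_learning(data_RA, n_components=number_of_components_RA, alpha=alpha_value_RA, n_jobs=-1,
                                 verbose=False, positive_dict=True, positive_code=True)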
# CVXPY implementation
import numpy as np
import cvxpy as cp

# U_RA must already exist as a nonnegative NumPy array of shape (m_RA, k_RA)
# before the first iteration (the initialization is not shown here).
MAX_ITERS = 10
residual = np.zeros(MAX_ITERS)
for iter_num in range(1, 1 + MAX_ITERS):
    # At the beginning of an iteration, U and V are NumPy
    # array types, NOT CVXPY variables.
    # For odd iterations, treat U constant, optimize over V.
    if iter_num % 2 == 1:
        V_RA = cp.Variable((k_RA, n_RA))
        constraint = [V_RA >= 0]
        # constraint += [cp.norm2(V_RA, 1) <= np.ones((V_RA.shape[0],))]
        print('Estimating V')
    # For even iterations, treat V constant, optimize over U.
    else:
        U_RA = cp.Variable((m_RA, k_RA))
        constraint = [U_RA >= 0]
        print('Estimating U')
    # Solve the problem.
    # Increase max_iters; otherwise a few iterations are "OPTIMAL_INACCURATE"
    # (e.g. a few of the entries in U or V are negative beyond standard tolerances).
    # Use @ for the matrix product; * between matrix expressions is deprecated in CVXPY.
    obj = cp.Minimize(cp.norm(data_RA - U_RA @ V_RA, 'fro') + alpha_value_RA * cp.pnorm(U_RA, 1))
    prob = cp.Problem(obj, constraint)
    prob.solve(solver='SCS', max_iters=1000, verbose=True, use_indirect=True)
    if prob.status != cp.OPTIMAL:
        raise Exception("Solver did not converge!")
    print('Iteration {}, residual norm {}'.format(iter_num, prob.value))
    residual[iter_num - 1] = prob.value
    # Convert variable to NumPy array constant for next iteration.
    if iter_num % 2 == 1:
        V_RA = V_RA.value
    else:
        U_RA = U_RA.value
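To compare factors coming from the two approaches on an equal footing, the value that the CVXPY loop prints as "residual norm" can be recomputed with plain NumPy. This is only a minimal sketch: the function name sparse_factorization_objective and the small random matrices are made up for illustration, and it assumes U and V are ordinary NumPy arrays of compatible shapes.

```python
import numpy as np


def sparse_factorization_objective(X, U, V, alpha):
    """Frobenius norm of X - U V plus alpha times the entrywise L1 norm of U.

    Mirrors cp.norm(X - U @ V, 'fro') + alpha * cp.pnorm(U, 1) for fixed arrays.
    """
    residual = np.linalg.norm(X - U @ V, ord="fro")
    l1_penalty = alpha * np.abs(U).sum()
    return residual + l1_penalty


# Hypothetical small example (shapes chosen arbitrarily).
rng = np.random.default_rng(0)
X = rng.random((6, 5))
U = rng.random((6, 2))
V = rng.random((2, 5))
value = sparse_factorization_objective(X, U, V, alpha=0.1)
print("objective value:", value)
```

Calling it on the U_RA and V_RA arrays after each iteration should reproduce the logged prob.value up to solver tolerance, which gives a quick sanity check that the loop is making progress.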