Why is the scikit-learn implementation of dictionary learning faster than CVXPY?

Posted 2024-04-23 20:55:55


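I am implementing a sparse matrix factorization routine in Python that produces a sparse score matrix and a dense feature matrix. scikit-learn has a direct implementation, sklearn.decomposition.DictionaryLearning, which handles a 1000x947 matrix and returns a result in about 20 s on a 16-core AMD system with 16 GB of RAM. Although it is fast and easy to work with, I have to add extra constraints to the problem, which requires something like the CVXPY toolbox. I am using the "SCS" solver through CVXPY, but the alternating minimization formulation, which alternately minimizes over the sparse score matrix and the dense feature matrix, will not even run on my computer and takes several days on an HPC cluster. What am I missing?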


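# scikit-learn implementation
    # Assumes `import sklearn.decomposition as skd`.
    # Note: dict_learning returns a (code, dictionary, errors) tuple, not an estimator object.
    estimator_RA = skd.dict_learning(data_RA, n_components=number_of_components_RA,
                                     alpha=alpha_value_RA, n_jobs=-1, verbose=False,
                                     positive_dict=True, positive_code=True)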


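# CVXPY implementation
    # Assumes `import numpy as np` and `import cvxpy as cp`, and that U_RA (m_RA x k_RA)
    # has already been initialized as a non-negative NumPy array before the loop starts.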
    MAX_ITERS = 10
    residual = np.zeros(MAX_ITERS)
    for iter_num in range(1, 1 + MAX_ITERS):
        # At the beginning of an iteration, U and V are NumPy
        # array types, NOT CVXPY variables.

        # For odd iterations, treat U constant, optimize over V.
        if iter_num % 2 == 1:
            V_RA = cp.Variable((k_RA, n_RA))
            constraint = [V_RA >= 0]
            # constraint += [cp.norm2(V_RA, 1) <= np.ones((V_RA.shape[0],))]

            print('Estimating V')
        # For even iterations, treat V constant, optimize over U.
        else:
            U_RA = cp.Variable((m_RA, k_RA))
            constraint = [U_RA >= 0]

            print('Estimating U')

        # Solve the problem.
        # Increase max_iters if needed; otherwise a few iterations end as "OPTIMAL_INACCURATE"
        # (e.g. a few of the entries in U or V are negative beyond standard tolerances).
        # `@` is matrix multiplication; `*` on CVXPY expressions is deprecated for this use.
        obj = cp.Minimize(cp.norm(data_RA - U_RA @ V_RA, 'fro') + alpha_value_RA * cp.pnorm(U_RA, 1))


        prob = cp.Problem(obj, constraint)
        prob.solve(solver='SCS', max_iters=1000, verbose=True, use_indirect=True)

        if prob.status != cp.OPTIMAL:
            raise Exception("Solver did not converge!")

        print('Iteration {}, residual norm {}'.format(iter_num, prob.value))
        residual[iter_num - 1] = prob.value

        # Convert variable to NumPy array constant for next iteration.
        if iter_num % 2 == 1:
            V_RA = V_RA.value
        else:
            U_RA = U_RA.value
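For comparison, the two objectives can be written out side by side. The first is read directly off the `cp.Minimize(...)` call in the loop above. The second is the objective stated in the scikit-learn documentation for `dict_learning`. The symbols $X$, $U$, $V$ and $\alpha$ are shorthand introduced here for `data_RA`, `U_RA`, `V_RA` and `alpha_value_RA`; they do not appear in the original post.

```latex
% CVXPY loop: un-squared Frobenius norm plus an entrywise l1 penalty on U,
% minimized alternately over V (U fixed) and over U (V fixed)
\min_{U \ge 0,\; V \ge 0} \;
  \lVert X - UV \rVert_F \;+\; \alpha \sum_{i,j} \lvert U_{ij} \rvert

% scikit-learn dict_learning, per its documentation
% (positive_code / positive_dict additionally impose U >= 0 and V >= 0)
\min_{U,\; V} \;
  \tfrac{1}{2} \lVert X - UV \rVert_F^2 \;+\; \alpha \lVert U \rVert_{1,1}
  \quad \text{subject to} \quad
  \lVert V_k \rVert_2 \le 1 \;\; \text{for all } 0 \le k < n_{\text{components}}
```

The two formulations differ in the squared versus un-squared data-fit term and in the norm bound on the rows of $V$ (which the loop above has only as a commented-out constraint), so they are closely related but not identical problems.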

