How can I implement the sample-distance partial least squares (SAMPLS) algorithm in Python?

Posted 2024-05-16 02:53:28

I am trying to implement the SAMPLS algorithm in Python, based on the following papers:

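[1] Bush, B. L. & Nachbar, R. B. (1993). Sample-distance partial least squares: PLS optimized for many variables, with application to CoMFA. Journal of Computer-Aided Molecular Design, 7(5), 587-619.

[2] Sheridan, R. P., Nachbar, R. B. & Bush, B. L. (1994). Extending the trend vector: The trend matrix and sample-based partial least squares. Journal of Computer-Aided Molecular Design, 8(3), 323-340.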

SAMPLS is a faster variant of PLS, but I could not find any other material or tools for it online.

Here is my code. Compared with sklearn's PLS regression, its results seem to be incorrect:

import numpy as np


class SAMPLS:
    def __init__(self, n_components):
        self.n = n_components

    def fit(self, X, y):
        assert X.shape[1] >= self.n
        C = X.dot(X.T)  # n_samples x n_samples matrix of sample inner products
        # Index 0 is a placeholder so that component h is stored at index h
        self.t = [0]
        self.v = [0]
        self.y = [0]
        self.A = [0]
        y = (y - np.mean(y))/np.std(y)+1e-4

        A_fit = 0
        for h in range(1, self.n+1):
            s = C.dot(y)
            s -= np.mean(s)  # center s
            u = y
            if h > 1:
                # Orthogonalize s against the previously stored vectors t[1..h-1]
                for g in range(1, h):
                    alpha = s.T.dot(self.t[g])/self.t[g].T.dot(self.t[g])
                    s = s - alpha*self.t[g]
                    u = u - alpha*self.v[g]

            t = s
            beta = (t.T.dot(y))/(t.T.dot(t))

            # Save for subsequent fit and for prediction
            self.t.append(t)
            self.v.append(beta*u)
            self.y.append(y)

            # Update residual y; update and save fitted activity A_fit
            y = y - beta*t
            A_fit = A_fit + beta*t
            self.A.append(A_fit)

        # Sum the sample weights v[h], mapped to feature space, over all components
        T_tot = 0
        for h in range(1, self.n+1):
            T = 0
            for i in range(len(X)):
                T += self.v[h][i] * X[i]
            T_tot += T
        self.T_tot = T_tot

    def predict(self, X):
        y_pred = []
        for m in range(len(X)):
            A_pred = 0
            for k in range(X.shape[1]):
                A_pred += self.T_tot[k]*X[m, k]
            y_pred.append(A_pred)
        return y_pred
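
For comparison, here is a minimal sketch of one way the loop could be arranged. It is a kernel-style PLS1 that works on the n x n sample matrix, and it has not been checked line by line against [1] or [2]. The class name `SAMPLSSketch` and the attributes `coef_`, `x_mean_` and `y_mean_` are illustrative names, not taken from the papers. It assumes a single response `y`, and it only centers `X` and `y` (no scaling). It differs from the code above in three places that may explain the mismatch. First, `X` is centered in `fit` and the centering is undone in `predict`. Second, the vector `u` used during orthogonalization is kept unscaled, so that `t == C @ u` holds for every stored pair. Third, `y` is not rescaled, which avoids the operator-precedence issue in `(y - mean)/std + 1e-4`.

```python
import numpy as np


class SAMPLSSketch:
    """Kernel-style PLS1 on the n x n sample matrix C = Xc @ Xc.T (a sketch)."""

    def __init__(self, n_components):
        self.n_components = n_components

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float).ravel()
        # Assumption: center X and y only (no scaling); predict() undoes this.
        self.x_mean_ = X.mean(axis=0)
        self.y_mean_ = y.mean()
        Xc = X - self.x_mean_
        r = y - self.y_mean_      # current residual of y
        C = Xc @ Xc.T             # sample-by-sample matrix

        ts, us = [], []           # scores t_h and vectors u_h with t_h == C @ u_h
        w = np.zeros(len(y))      # accumulated sample weights: fitted part == C @ w
        for _ in range(self.n_components):
            u = r.copy()
            t = C @ u
            # Orthogonalize t against earlier scores, applying the same
            # combination to u so that t == C @ u still holds afterwards.
            for t_g, u_g in zip(ts, us):
                alpha = (t @ t_g) / (t_g @ t_g)
                t = t - alpha * t_g
                u = u - alpha * u_g
            beta = (t @ r) / (t @ t)
            r = r - beta * t      # deflate the residual
            w = w + beta * u
            ts.append(t)
            us.append(u)

        # Map sample weights to feature coefficients: Xc @ coef_ == C @ w.
        self.coef_ = Xc.T @ w
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        return (X - self.x_mean_) @ self.coef_ + self.y_mean_
```

When checking a sketch like this against sklearn, `PLSRegression(n_components=h, scale=False)` is probably the closer setting, since the default `scale=True` also standardizes `X` and `y`. A library-independent sanity check is that, with as many components as `X` has linearly independent columns, the training predictions should coincide with an ordinary least-squares fit with intercept.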
