根据以下文献,我尝试在Python中实现SAMPLS算法:
[1]布什,B。L.&;纳赫巴尔,R。B(1993). 样本距离偏最小二乘法:PLS针对许多变量进行优化,并应用于CoMFA。计算机辅助分子设计杂志,7(5),587-619
[2]谢里登,R。P.,Nachbar,R。B.&;布什,B。L(1994). 扩展趋势向量:趋势矩阵和基于样本的偏最小二乘法。计算机辅助分子设计杂志,8(3),323-340
这是一个更快的PLS算法。但我在网上找不到任何其他材料或工具
这是我的代码,与sklearn的PLS回归相比,结果似乎不正确:
class SAMPLS:
def __init__(self,n_components):
self.n = n_components
def fit(self,X,y):
assert X.shape[1] >= self.n
C = X.dot(X.T)
self.t = [0]
self.v = [0]
self.y = [0]
self.A = [0]
y = (y - np.mean(y))/np.std(y)+1e-4
A_fit = 0
for h in range(1,self.n+1):
s = C.dot(y)
s -= np.mean(s) #center s
u = y
if h > 1:
for g in range(1,h):
alpha = s.T.dot(self.t[g])/self.t[g].T.dot(self.t[g])
s = s - alpha*self.t[g]
u = u - alpha*self.v[g]
t = s
beta = (t.T.dot(y))/(t.T.dot(t))
#Save for subsequent fit and for prediction
self.t.append(t)
self.v.append(beta*u)
self.y.append(y)
#Update residual y; update and save fitted activity A_fit
y = y - beta*t
A_fit = A_fit + beta*t
self.A.append(A_fit)
T_tot = 0
for h in range(1,self.n+1):
T = 0
for i in range(len(X)):
T +=self.v[h][i] * X[i]
T_tot += T
self.T_tot = T_tot
def predict(self,X):
y_pred = []
for m in range(len(X)):
A_pred = 0
for k in range(X.shape[1]):
A_pred += self.T_tot[k]*X[m,k]
y_pred.append(A_pred)
return y_pred
目前没有回答
相关问题 更多 >
编程相关推荐