Hello, I'm learning the first principles of machine learning, so I'm coding logistic regression with backprop from scratch using NumPy and calculus. Updating the derivatives with an exponentially weighted average (momentum) works for me, but RMSProp and Adam do not: the cost does not decrease. What am I doing wrong?
The main stumbling block is this Adam update:
# momentum
VW = beta*VW + (1-beta)*dW
Vb = beta*Vb + (1-beta)*db
# rmsprop
SW = beta2*SW + (1-beta2)*dW**2
Sb = beta2*Sb + (1-beta2)*db**2
# update weights  TODO: Adam doesn't work
W -= learning_rate*VW/(np.sqrt(SW)+epsilon)
b -= learning_rate*Vb/(np.sqrt(Sb)+epsilon)
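For reference, the textbook Adam update also applies bias correction to both moment estimates before the step, which the snippet above omits. Here is a minimal, self-contained sketch of that update on a toy 1-D quadratic (the function name and the toy objective are my own, not from the code above):

```python
import numpy as np

def adam_step(w, dw, vw, sw, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One bias-corrected Adam step; t is the 1-based iteration count."""
    vw = beta1*vw + (1-beta1)*dw         # first moment (momentum)
    sw = beta2*sw + (1-beta2)*dw**2      # second moment (rmsprop)
    vw_hat = vw / (1 - beta1**t)         # bias correction
    sw_hat = sw / (1 - beta2**t)
    w = w - lr*vw_hat/(np.sqrt(sw_hat) + eps)
    return w, vw, sw

# minimize (w - 3)^2, whose gradient is 2*(w - 3)
w, vw, sw = 0.0, 0.0, 0.0
for t in range(1, 2001):
    w, vw, sw = adam_step(w, 2*(w - 3.0), vw, sw, t, lr=0.05)
```

One thing this toy run makes visible: because the ratio `vw_hat/sqrt(sw_hat)` has magnitude near 1, each Adam step is on the order of `lr` itself regardless of the raw gradient's scale, so Adam is typically run with learning rates around 1e-3 rather than values tuned for raw gradient magnitudes.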
The full code is:
# load dataset breast cancer
import sklearn
from sklearn import *
import numpy as np
import matplotlib.pyplot as plt
X,y = sklearn.datasets.load_breast_cancer(return_X_y=True, as_frame=False)
# scaling input
X = (X-np.mean(X,0))/np.std(X,0)
# avoid rank 1 vector
y = y.reshape(len(y),1)
# stat
m = X.shape[0]
n = X.shape[1]
# hyper parameters
num_iter = 20000
learning_rate = 1e-6
beta = 0.9
beta2 = 0.999
epsilon = 1e-8
# init
np.random.seed(42)
W = np.random.randn(n,1)
b = np.random.randn(1)
y = y.reshape(len(y),1)
VW = np.zeros((n,1))
Vb = np.zeros(1)
SW = np.zeros((n,1))
Sb = np.zeros(1)
for i in range(num_iter):
    # forward
    Z = X.dot(W) + b  # shape (m, 1)
    # sigmoid
    A = 1/(1+np.exp(-Z))
    # categorical cross-entropy
    # cost = -np.sum(y*np.log(A))/m
    # binary classification cost
    j = (-y*np.log(A) - (1-y)*np.log(1-A)).sum()*(1/m)
    if i % 1000 == 999:
        print(i, j)
    # backward
    # derivatives of the cost j
    dA = (A-y)/(A*(1-A))
    dZ = A-y
    dW = X.transpose().dot(dZ)
    db = dZ.sum()
    # momentum
    VW = beta*VW + (1-beta)*dW
    Vb = beta*Vb + (1-beta)*db
    # rmsprop
    SW = beta2*SW + (1-beta2)*dW**2
    Sb = beta2*Sb + (1-beta2)*db**2
    # update weights  TODO: Adam doesn't work
    W -= learning_rate*VW/(np.sqrt(SW)+epsilon)
    b -= learning_rate*Vb/(np.sqrt(Sb)+epsilon)
print(sklearn.metrics.classification_report(y,np.round(A),target_names=['benign','malignant']))
The results show that, for this particular problem, RMSProp/Adam take much longer to converge than plain gradient descent. Is my implementation correct?
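As a sanity check on the backward pass, `dZ = A - y` can be verified against a numerical gradient on a tiny synthetic example (a minimal sketch with made-up data; note that since the cost `j` averages over `m`, the exact gradient of `j` with respect to `Z` is `(A - y)/m`):

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.standard_normal((5, 1))
y = rng.integers(0, 2, size=(5, 1)).astype(float)
m = len(y)

def cost(Z):
    A = 1/(1 + np.exp(-Z))
    return (-y*np.log(A) - (1 - y)*np.log(1 - A)).sum()/m

A = 1/(1 + np.exp(-Z))
analytic = (A - y)/m            # gradient of the averaged cost w.r.t. Z
numeric = np.zeros_like(Z)
h = 1e-6
for k in range(m):
    Zp, Zm = Z.copy(), Z.copy()
    Zp[k] += h
    Zm[k] -= h
    numeric[k] = (cost(Zp) - cost(Zm))/(2*h)  # central difference
max_err = np.max(np.abs(analytic - numeric))
```

If `max_err` is tiny (around 1e-9), the analytic gradient matches the numerical one.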