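I'm working through an online tutorial on neural networks, neuralnetworksanddeeplearning.com. The author, Nielsen, implements L2 regularization in the code as part of the tutorial. He then asks us to modify the code so that it uses L1 regularization instead of L2. This link takes you straight to the part of the tutorial I'm talking about.
Nielsen implements it in Python as follows: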
    self.weights = [(1-eta*(lmbda/n))*w-(eta/len(mini_batch))*nw
                    for w, nw in zip(self.weights, nabla_w)]
With L1 regularization, the update rule becomes:
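Written out for a mini-batch of size m (the symbol m and the per-example cost C_x are notation assumed here; n is the total training-set size, as in the code), the L1 rule from Nielsen's book reads roughly:

\[
w \;\rightarrow\; w' = w - \frac{\eta \lambda}{n}\,\operatorname{sgn}(w) - \frac{\eta}{m} \sum_{x} \frac{\partial C_x}{\partial w}
\]

Note that the regularization term is scaled by n, not by m.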
I tried to implement it as follows:
    self.weights = [(w - eta* (lmbda/len(mini_batch)) * np.sign(w) - (eta/len(mini_batch)) * nw)
                    for w, nw in zip(self.weights, nabla_w)]
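Suddenly my network's classification accuracy dropped to roughly chance level. How can that be? Did I make a mistake in implementing L1 regularization? The network has 30 hidden neurons, a learning rate of 0.5, and lambda = 5.0. With L2 regularization everything works fine.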
For convenience, here is the whole update function:
    def update_mini_batch(self, mini_batch, eta, lmbda, n):
        """Update the network's weights and biases by applying gradient
        descent using backpropagation to a single mini batch.  The
        ``mini_batch`` is a list of tuples ``(x, y)``, ``eta`` is the
        learning rate, ``lmbda`` is the regularization parameter, and
        ``n`` is the total size of the training data set.
        """
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        for x, y in mini_batch:
            delta_nabla_b, delta_nabla_w = self.backprop(x, y)
            nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
            nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
        self.weights = [(1-eta*(lmbda/n))*w-(eta/len(mini_batch))*nw
                        for w, nw in zip(self.weights, nabla_w)]
        self.biases = [b-(eta/len(mini_batch))*nb
                       for b, nb in zip(self.biases, nabla_b)]
Your math is off. Translated into code, the formula you want to implement is:
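A minimal sketch of what that translation might look like, assuming Nielsen's L1 rule (penalty scaled by `lmbda/n`, gradient averaged over the mini-batch). The helper name `l1_weight_update` and the `batch_size` parameter are hypothetical, introduced only so the snippet runs on its own outside the `Network` class:

```python
import numpy as np

def l1_weight_update(weights, nabla_w, eta, lmbda, n, batch_size):
    """Hypothetical helper: one L1-regularized SGD step.

    weights, nabla_w: lists of arrays with matching shapes, where nabla_w
    holds the gradients summed over the mini-batch.
    n: total size of the training set; batch_size: len(mini_batch).
    """
    return [w - eta * (lmbda / n) * np.sign(w) - (eta / batch_size) * nw
            for w, nw in zip(weights, nabla_w)]

# Toy check with a zero gradient, so only the L1 penalty acts.
# n = 50000 and batch_size = 10 are assumed values, not taken from the question.
weights = [np.array([[0.5, -0.5], [0.0, 2.0]])]
nabla_w = [np.zeros((2, 2))]
updated = l1_weight_update(weights, nabla_w, eta=0.5, lmbda=5.0,
                           n=50000, batch_size=10)
print(updated[0])
```

Inside `update_mini_batch`, the same expression would replace the `self.weights = [...]` comprehension, with `len(mini_batch)` in place of `batch_size`. Each nonzero weight moves toward zero by the same fixed amount `eta*lmbda/n`, and `np.sign(0)` is 0, so weights that are exactly zero are left alone by the penalty.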
The two modifications required are:
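Set side by side with Nielsen's L1 rule, the attempt in the question differs visibly in how the penalty is scaled: `lmbda/len(mini_batch)` where the rule has `lmbda/n`. A rough check of what that does to the step size, assuming a mini-batch of 10 and n = 50000 (neither value is stated in the question):

```python
eta, lmbda = 0.5, 5.0   # values given in the question
n = 50000               # assumed training-set size
batch_size = 10         # assumed mini-batch size

# Size of the L1 shrinkage applied to every nonzero weight per update.
step_per_n = eta * lmbda / n               # scaling used by the L1 rule
step_per_batch = eta * lmbda / batch_size  # scaling used in the question's attempt

# Effect on one small positive weight (0.03 is an illustrative value).
w = 0.03
w_rule = w - step_per_n       # stays close to 0.03
w_attempt = w - step_per_batch  # overshoots zero and flips sign

print(step_per_n, step_per_batch, w_rule, w_attempt)
```

Under these assumptions the attempt's penalty step is thousands of times larger than the rule's. It dwarfs a small weight and pushes it back and forth across zero on every update, which would be consistent with accuracy collapsing to chance level.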