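I am trying to finish my backpropagation code. The last step is to compute the changes to the weights and biases (using the quadratic cost). This step involves transposing one array and then performing a matrix multiplication of the two arrays.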
import numpy as np

# necessary functions for this example
def sigmoid(z):
    return 1.0/(1.0+np.exp(-z))

def prime(z):
    # derivative of the sigmoid
    return sigmoid(z) * (1-sigmoid(z))

def cost_derivative(output_activations, y):
    return (output_activations-y)

# Mock weight and bias matrices
weights = [np.array([[ 1,  0,  2],
                     [ 2, -1,  0],
                     [ 4, -1,  0],
                     [ 1,  3, -2],
                     [ 0,  0, -1]]),
           np.array([[2, 0, -1, -1, 2],
                     [0, 2, -1, -1, 0]])]
biases = [np.array([-1, 2, 0, 0, 4]), np.array([-2, 1])]

# The mock training examples
q = [(np.array([1, -2, 3]), np.array([0, 1])),
     (np.array([2, -3, 5]), np.array([1, 0])),
     (np.array([3, 6, -1]), np.array([1, 0])),
     (np.array([4, -1, -1]), np.array([0, 0]))]

nabla_b = [np.zeros(b.shape) for b in biases]
nabla_w = [np.zeros(w.shape) for w in weights]

for x, y in q:
    # Forward pass, storing every z and every activation
    activation = x
    activations = [x]
    zs = []
    for w, b in zip(weights, biases):
        z = np.dot(w, activation) + b
        zs.append(z)
        activation = sigmoid(z)
        activations.append(activation)
    # Computation of last layer
    delta = cost_derivative(activations[-1], y) * prime(zs[-1])
    nabla_b[-1] = delta
    # The line in question: this does not produce the expected 5x2 matrix
    nabla_w[-1] = np.dot(np.transpose(activations[-2]), delta) + biases
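I printed delta; for the first training example it is [ 0.14541528 -0.14808645], which is a 1x2 matrix, and
activations[-2] = [9.97527377e-01 9.97527377e-01 9.97527377e-01 1.67014218e-05 7.31058579e-01]
which is a 1x5 matrix. Transposing activations[-2] should give a 5x1, and the resulting multiplication should give a 5x2 matrix, but it does not.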
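
A likely explanation, offered as a sketch rather than a confirmed diagnosis: both arrays printed above are 1-D NumPy arrays with shapes (2,) and (5,), not 1x2 and 1x5 matrices. np.transpose leaves a 1-D array unchanged, so np.dot is handed two vectors of different lengths instead of a 5x1 and a 1x2. The snippet below reuses the two printed values and shows two ways to get the outer product. The names a_prev, grad_5x2, grad_2x5, col and row are introduced here for illustration only and do not appear in the question.

```python
import numpy as np

# Values printed in the question for the first training example
delta = np.array([0.14541528, -0.14808645])
a_prev = np.array([9.97527377e-01, 9.97527377e-01, 9.97527377e-01,
                   1.67014218e-05, 7.31058579e-01])  # activations[-2]

# Transposing a 1-D array is a no-op: the shape stays (5,)
shape_after_transpose = np.transpose(a_prev).shape

# Option 1: np.outer builds the outer product directly from 1-D inputs
grad_5x2 = np.outer(a_prev, delta)   # the 5x2 result expected in the question
grad_2x5 = np.outer(delta, a_prev)   # same shape as weights[-1], which is 2x5

# Option 2: make the arrays explicitly 2-D, then use np.dot
col = a_prev.reshape(-1, 1)          # 5x1 column
row = delta.reshape(1, -1)           # 1x2 row
grad_via_dot = np.dot(col, row)      # 5x2, equal to grad_5x2

print(shape_after_transpose, grad_5x2.shape, grad_2x5.shape)
```

Since weights[-1] has shape 2x5, the 2x5 form (delta first) is probably the one that fits nabla_w[-1]. Separately, the `+ biases` term in the original line adds a Python list of two differently shaped arrays to the result; the gradient of the cost with respect to the weights normally has no such term, so it is worth checking whether it was intended.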