Gradient descent weights keep growing larger
To better understand the gradient descent algorithm, I tried building a linear regression model myself. It works well for a small number of data points, but when I train it on more data, the values of w0 and w1 keep growing larger and larger. Can anyone explain what's happening?
import numpy as np

class LinearRegression:
    def __init__(self, x_vector, y_vector):
        self.x_vector = np.array(x_vector, dtype=np.float64)
        self.y_vector = np.array(y_vector, dtype=np.float64)
        self.w0 = 0
        self.w1 = 0

    def _get_predicted_values(self, x):
        # Linear model: y_hat = w0 + w1 * x
        formula = lambda x: self.w0 + self.w1 * x
        return formula(x)

    def _get_gradient_matrix(self):
        # Gradient of the summed (unnormalized) squared error w.r.t. w0 and w1
        predictions = self._get_predicted_values(self.x_vector)
        w0_hat = sum((self.y_vector - predictions))
        w1_hat = sum((self.y_vector - predictions) * self.x_vector)
        gradient_matrix = np.array([w0_hat, w1_hat])
        gradient_matrix = -2 * gradient_matrix
        return gradient_matrix

    def fit(self, step_size=0.001, num_iterations=500):
        for _ in range(num_iterations):
            gradient_matrix = self._get_gradient_matrix()
            self.w0 -= step_size * gradient_matrix[0]
            self.w1 -= step_size * gradient_matrix[1]

    def show_coeffiecients(self):
        print(f"w0: {self.w0}\tw1: {self.w1}\t")

    def predict(self, x):
        y = self.w0 + self.w1 * x
        return y
# This works fine
x = [x for x in range(-3, 3)]
f = lambda x: 5 * x - 7
y = [f(x_val) for x_val in x]
model = LinearRegression(x, y)
model.fit(num_iterations=3000)
model.show_coeffiecients()  # output: w0: -6.99999999999994  w1: 5.00000000000002
# While this doesn't
x = [x for x in range(-50, 50)] # Increased the number of x values
f = lambda x: 5 * x - 7
y = [f(x_val) for x_val in x]
model = LinearRegression(x, y)
model.fit(num_iterations=3000)
model.show_coeffiecients()
The last line produces a warning:
RuntimeWarning: overflow encountered in multiply
w1_hat = sum((self.y_vector - predictions) * self.x_vector)
formula = lambda x: self.w0 + self.w1 * x
1 Answer
There are two possible fixes here:
- If we're talking about the mean squared error (MSE) and its derivative, then your code is missing one thing: dividing by the number of samples. The gradient values you're getting are large, which is probably why you can't reach the minimum of the cost function. So I'd suggest trying this (a fuller sketch follows after this list):
gradient_matrix = -2 * gradient_matrix / len(self.x_vector)
- If you really want to keep using the (unnormalized) squared error, then reduce the value of step_size. That shrinks each update, so you avoid overshooting past the minimum of the function.
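For reference, here is a minimal sketch of the first fix, assuming the same class layout as in the question. The only change from the original _get_gradient_matrix is the division by len(self.x_vector), which turns the summed squared-error gradient into an MSE gradient whose magnitude no longer grows with the dataset size:

# Drop-in replacement for _get_gradient_matrix in the LinearRegression class above
def _get_gradient_matrix(self):
    predictions = self._get_predicted_values(self.x_vector)
    w0_hat = sum(self.y_vector - predictions)
    w1_hat = sum((self.y_vector - predictions) * self.x_vector)
    # Dividing by the sample count gives the gradient of the *mean* squared
    # error, so its scale is independent of how many points we have.
    return -2 * np.array([w0_hat, w1_hat]) / len(self.x_vector)

As a rough back-of-the-envelope check (ignoring the small coupling with w0): for x in range(-50, 50), sum(x**2) is about 83,000, so without normalization each step scales the error in w1 by roughly 1 - 2 * 0.001 * 83,000 ≈ -166. Since that factor has magnitude far above 1, the error alternates sign and grows until it overflows. Dividing by the 100 samples brings the factor to about 1 - 2 * 0.001 * 830 ≈ -0.67, which is inside the convergent range; the second fix (shrinking step_size to somewhere around 1e-5 or below, possibly with more iterations) achieves the same thing. With the normalized gradient, the failing example from the question should now behave:

model = LinearRegression(x, y)  # x over range(-50, 50)
model.fit(num_iterations=3000)
model.show_coeffiecients()  # converges toward w0 ≈ -7, w1 ≈ 5 (w0 may need extra iterations to tighten fully)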