PyTorch: Backpropagating from the sum of matrix elements to a leaf variable

import torch

# Works
a = torch.tensor([1.])
a.requires_grad = True
b = torch.tensor([1.])
c = torch.cat([a,b])
d = torch.sum(c)
d.backward()
print('a gradient is')
print(a.grad)  #=> tensor([1.])

# Doesn't work
a = torch.tensor([1.])
a.requires_grad = True
a = a.reshape(a.shape)
b = torch.tensor([1.])
c = torch.cat([a,b])
d = torch.sum(c)
d.backward()
print('a gradient is')
print(a.grad)  #=> None

1 Answer

Anonymous user

#1 · Posted 2024-04-27 03:20:55

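Edit:

Here is a detailed explanation of what is going on ("this is not a bug per se, but it is definitely a source of confusion"): https://github.com/pytorch/pytorch/issues/19778

So one solution is to explicitly ask PyTorch to retain the gradient for a, which is now a non-leaf tensor: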

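import torch

a = torch.tensor([1.])
a.requires_grad = True
a = a.reshape(a.shape)  # a is now a non-leaf tensor
a.retain_grad()         # keep .grad on this non-leaf after backward()
b = torch.tensor([1.])
c = torch.cat([a,b])
d = torch.sum(c)
d.backward()
print(a.grad)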
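As an alternative to retain_grad(), and not part of the original answer, one can simply avoid rebinding the name a, so that a reference to the original leaf survives. The sketch below assumes the reshaped tensor is only needed for the forward computation; the names leaf and reshaped are hypothetical.

```python
import torch

# Keep the user-created leaf under its own name.
leaf = torch.tensor([1.0], requires_grad=True)

# The reshaped tensor is a non-leaf derived from `leaf`; give it a different name.
reshaped = leaf.reshape(leaf.shape)

b = torch.tensor([1.0])
d = torch.sum(torch.cat([reshaped, b]))
d.backward()

# The gradient is accumulated on the leaf, not on the derived tensor.
print(leaf.grad)
```

The gradient still flows through the reshape; it just lands on the leaf, which is where autograd stores gradients by default.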

Old answer:

If you move a.requires_grad = True to after the reshape, it works:

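a = torch.tensor([1.])
a = a.reshape(a.shape)
a.requires_grad = True
b = torch.tensor([1.])
c = torch.cat([a,b])
d = torch.sum(c)
d.backward()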

This looked like a bug in PyTorch, because a.requires_grad is still True after the following:

a = torch.tensor([1.])
a.requires_grad = True
a = a.reshape(a.shape)

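This seems to be related to the fact that a is no longer a leaf in the "Doesn't work" example, while it is still a leaf in the other cases (print a.is_leaf to check).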
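The leaf status of the three variants discussed in this thread can be checked directly. This is a minimal sketch; the names works, broken and fixed are hypothetical labels for the three cases, not identifiers from the original post.

```python
import torch

# Case 1: requires_grad set on a user-created tensor, no reshape.
works = torch.tensor([1.0])
works.requires_grad = True

# Case 2: requires_grad set first, then reshape. The result of reshape is
# produced by an autograd-tracked operation, so it is not a leaf.
broken = torch.tensor([1.0])
broken.requires_grad = True
broken = broken.reshape(broken.shape)

# Case 3: reshape first, then requires_grad. The reshape is not tracked
# (its input does not require grad), so the result is still a leaf.
fixed = torch.tensor([1.0])
fixed = fixed.reshape(fixed.shape)
fixed.requires_grad = True

for name, t in [("works", works), ("broken", broken), ("fixed", fixed)]:
    print(name, "requires_grad:", t.requires_grad,
          "is_leaf:", t.is_leaf,
          "has grad_fn:", t.grad_fn is not None)
```

All three tensors report requires_grad as True, but only the second one has a grad_fn and is a non-leaf, which is why its .grad is not populated by backward() unless retain_grad() is called.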
