Reparameterizing a model in PyTorch

Question

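I am trying to optimize the parameters of a simple model implemented with PyTorch. For the optimization, I would like to represent the parameters differently from how the model class specifies them. Specifically, I want to represent my parameters as a single vector, rather than as two vectors as in this example.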

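I can convert model.parameters() (an Iterable) into the vector representation I want using parameters_to_vector from torch.nn.utils.convert_parameters. However, when I try to mark this vector as a leaf node (using detach and requires_grad_) and put it back into the model's original parameters with vector_to_parameters, the computation graph does not seem to know what happened.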

#!/usr/bin/python3
import torch
import torch.nn as nn
from torch.nn.utils.convert_parameters import parameters_to_vector, vector_to_parameters

class SimpleModel(nn.Module):
    def __init__(self):
        super(SimpleModel, self).__init__()
        self.linear = nn.Linear(10, 1)

    def forward(self, x):
        return self.linear(x)

model = SimpleModel()

for name, param in model.named_parameters():
    print(name, param.size())
# this prints:
## linear.weight torch.Size([1, 10])
## linear.bias torch.Size([1])

loss_function = nn.MSELoss()

vparams = parameters_to_vector(model.parameters()).detach().clone().requires_grad_(True)

# populate model.parameters() from vparams
vector_to_parameters(vparams, model.parameters())

input_data = torch.randn(1, 10)
output = model(input_data)
target = torch.randn(1, 1)

loss = loss_function(output, target)
# loss.backward()  ## if we do this, then vparams.grad is None

## this one works, but we wanted to use vparams:
# vgrads = torch.autograd.grad(loss, model.linear.weight)[0]   

## this gives an error:
vgrads = torch.autograd.grad(loss, vparams)[0]
## "One of the differentiated Tensors appears to not have been used in the graph."
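
The behaviour described above can be checked directly. The sketch below assumes, as the observed results suggest, that vector_to_parameters only copies values into each parameter's storage rather than recording a differentiable operation; it uses a bare nn.Linear in place of SimpleModel to stay short.

```python
import torch
import torch.nn as nn
from torch.nn.utils import parameters_to_vector, vector_to_parameters

model = nn.Linear(10, 1)  # stand-in for SimpleModel, for brevity
vparams = parameters_to_vector(model.parameters()).detach().clone().requires_grad_(True)
vector_to_parameters(vparams, model.parameters())

# Each parameter is still its own leaf tensor with no grad_fn,
# so nothing in the graph points back to vparams.
for name, p in model.named_parameters():
    print(name, "is_leaf:", p.is_leaf, "grad_fn:", p.grad_fn)

loss = model(torch.randn(1, 10)).sum()
loss.backward()

# Gradients land on the module's own parameters, not on vparams.
print("vparams.grad:", vparams.grad)
print("weight.grad shape:", model.weight.grad.shape)
```

If this reading is right, the values of vparams reach the model, but the dependency on vparams does not, which matches the error reported by torch.autograd.grad.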

I also tried slicing the vector manually, but that did not resolve the error. For example:

model.linear.weight.data.copy_(vparams[0:10].view_as(model.linear.weight.data))

or

model.linear.weight = nn.Parameter(vparams[0:10].view_as(model.linear.weight.data))

I am fairly new to PyTorch, but I have heard that PyTorch can compute gradients through slicing, so I think what I am attempting should be feasible.
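
As a sanity check on the claim about slicing, here is a minimal sketch that bypasses nn.Module and nn.Parameter entirely. It assumes the same shapes as the model above (10 weights plus 1 bias) and applies the linear layer through torch.nn.functional.linear, so the slices stay ordinary non-leaf tensors in the graph.

```python
import torch
import torch.nn.functional as F

# One flat leaf vector holding 10 weights followed by 1 bias
vparams = torch.randn(11, requires_grad=True)

# Slices and views are differentiable operations recorded by autograd
weight = vparams[:10].view(1, 10)
bias = vparams[10:]

x = torch.randn(1, 10)
target = torch.randn(1, 1)

loss = F.mse_loss(F.linear(x, weight, bias), target)
(vgrads,) = torch.autograd.grad(loss, vparams)
print(vgrads.shape)
```

Here the gradient with respect to the flat vector is available, which suggests the problem lies in how the slices are put back into the module, not in slicing itself.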

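Am I misunderstanding something about the torch.nn.Parameter class that PyTorch models use? Do instances of this class have to be leaf nodes in the computation graph? Here is a smaller example that omits the nn.Module subclass and just tries to create an nn.Parameter object from a slice: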

import torch
import torch.nn as nn

a = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

#b = nn.Parameter(a[1]) # "RuntimeError: One of the differentiated Tensors appears to not have been used in the graph."
#b = torch.Tensor(a[1]) # "IndexError: slice() cannot be applied to a 0-dim tensor."
b = a[1] # works

g = torch.autograd.grad(b, a)[0]

Judging from the error message, it seems PyTorch cannot differentiate through the initialization of an nn.Parameter. Is there a way to work around this?
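
One possible workaround, sketched here under assumptions and not taken from the original post, is to leave the module's own parameters untouched and run the forward pass with tensors sliced from the flat vector. This uses torch.func.functional_call, which assumes PyTorch 2.0 or later; unflatten is a hypothetical helper written for this illustration.

```python
import torch
import torch.nn as nn
from torch.func import functional_call  # assumes PyTorch >= 2.0
from torch.nn.utils import parameters_to_vector

class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(10, 1)

    def forward(self, x):
        return self.linear(x)

def unflatten(vec, model):
    """Hypothetical helper: slice vec into tensors shaped like the model's parameters."""
    out, offset = {}, 0
    for name, p in model.named_parameters():
        n = p.numel()
        out[name] = vec[offset:offset + n].view_as(p)  # differentiable slice + view
        offset += n
    return out

model = SimpleModel()
vparams = parameters_to_vector(model.parameters()).detach().clone().requires_grad_(True)

x = torch.randn(1, 10)
target = torch.randn(1, 1)

# Forward pass using the sliced tensors in place of the registered parameters
output = functional_call(model, unflatten(vparams, model), (x,))
loss = nn.functional.mse_loss(output, target)

(vgrads,) = torch.autograd.grad(loss, vparams)
print(vgrads.shape)
```

The design choice is to never wrap the slices in nn.Parameter, since doing so appears to produce a fresh leaf detached from vparams; the module only supplies the structure of the forward pass.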

