How do I add an L1 regularizer to the activations in PyTorch?

Posted 2024-04-19 20:54:29


(PyTorch beginner here)

I want to add an L1 regularizer to the activation outputs of a ReLU. More generally, how do I add a regularizer only to a particular layer of the network?

This post may be related: Adding L1/L2 regularization in PyTorch? But it is either unrelated to my question, or I don't understand its answer:

It refers to an L2 regularizer applied in the optimizer, which is a different thing. In other words, if the total desired loss is

crossentropy + lambda1*L1(layer1) + lambda2*L1(layer2) + ...

then I believe the argument supplied to torch.optim.Adagrad applies only to the cross-entropy loss. Or perhaps it is applied to all parameters (weights) of the network. Either way, it does not seem to allow applying a regularizer to a single layer of activations, and it does not provide an L1 loss.
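For reference, what the built-in optimizers expose is the weight_decay argument, which applies a single L2-style penalty to every parameter handed to the optimizer; it cannot target one layer or switch to L1. A minimal sketch, with 0.01 as a purely illustrative coefficient:

import torch

model = torch.nn.Linear(128, 2)
# weight_decay penalizes *all* parameters inside the optimizer's update step;
# there is no per-layer control and no L1 option here
optimizer = torch.optim.Adagrad(model.parameters(), lr=1e-2, weight_decay=0.01)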

Another related topic is nn.modules.loss, which includes L1Loss(). From the documentation, I still cannot figure out how to use it.
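For what it's worth, L1Loss computes the mean absolute error between two tensors, so it can be pressed into service as an activation penalty by comparing the activations against a zero target. A minimal sketch, not something the documentation spells out:

import torch

activations = torch.relu(torch.randn(4, 32))   # e.g. ReLU outputs of a layer
l1_loss = torch.nn.L1Loss()                    # mean absolute error
penalty = l1_loss(activations, torch.zeros_like(activations))
# equivalent to activations.abs().mean()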

Finally, there is the module https://github.com/pytorch/pytorch/blob/master/torch/legacy/nn/L1Penalty.py, which seems closest to the goal, but it is labelled "legacy". Why?


Tags: network, l1, parameters, nn, torch, pytorch, relu
2 Answers

Here is how you can do it:

  • In your module's forward, return both the final output and the outputs of the layers you want to apply L1 regularization to.
  • The loss variable will be the sum of the cross-entropy loss of the output w.r.t. the targets and the L1 penalties.

Here is some example code:

import torch
from torch.autograd import Variable
from torch.nn import functional as F


class MLP(torch.nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.linear1 = torch.nn.Linear(128, 32)
        self.linear2 = torch.nn.Linear(32, 16)
        self.linear3 = torch.nn.Linear(16, 2)

    def forward(self, x):
        layer1_out = F.relu(self.linear1(x))
        layer2_out = F.relu(self.linear2(layer1_out))
        out = self.linear3(layer2_out)
        return out, layer1_out, layer2_out

batchsize = 4
lambda1, lambda2 = 0.5, 0.01

model = MLP()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)

# usually the following code is looped over all batches,
# but let's just do a dummy batch for brevity

inputs = Variable(torch.rand(batchsize, 128))
targets = Variable(torch.ones(batchsize).long())

optimizer.zero_grad()
outputs, layer1_out, layer2_out = model(inputs)
cross_entropy_loss = F.cross_entropy(outputs, targets)

# L1/L2 penalties computed over the *parameters* (weights and biases)
# of linear1 and linear2
all_linear1_params = torch.cat([x.view(-1) for x in model.linear1.parameters()])
all_linear2_params = torch.cat([x.view(-1) for x in model.linear2.parameters()])
l1_regularization = lambda1 * torch.norm(all_linear1_params, 1)
l2_regularization = lambda2 * torch.norm(all_linear2_params, 2)

loss = cross_entropy_loss + l1_regularization + l2_regularization
loss.backward()
optimizer.step()
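Note that the snippet above penalizes the parameters of linear1 and linear2, even though forward() also returns the activations. If the goal is to penalize the ReLU outputs themselves, as the question asks, the returned activations can be used directly. A minimal sketch built on the same model, not part of the original answer:

optimizer.zero_grad()
outputs, layer1_out, layer2_out = model(inputs)
cross_entropy_loss = F.cross_entropy(outputs, targets)

# L1 penalty on the activations returned by forward()
l1_activation_penalty = lambda1 * layer1_out.abs().sum() + lambda2 * layer2_out.abs().sum()

loss = cross_entropy_loss + l1_activation_penalty
loss.backward()
optimizer.step()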

@Sasank Chilamkurthy Regularization should be applied to the weight parameters of each layer of the model, not to the output of each layer. See below: Regularization

import torch
from torch.autograd import Variable
from torch.nn import functional as F


class MLP(torch.nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.linear1 = torch.nn.Linear(128, 32)
        self.linear2 = torch.nn.Linear(32, 16)
        self.linear3 = torch.nn.Linear(16, 2)
    def forward(self, x):
        layer1_out = F.relu(self.linear1(x))
        layer2_out = F.relu(self.linear2(layer1_out))
        out = self.linear3(layer2_out)
        return out

batchsize = 4
lambda1, lambda2 = 0.5, 0.01

model = MLP()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)

inputs = Variable(torch.rand(batchsize, 128))
targets = Variable(torch.ones(batchsize).long())
# use float tensors as accumulators so the in-place additions below
# do not fail with a Long/Float dtype mismatch
l1_regularization, l2_regularization = torch.tensor(0.), torch.tensor(0.)

optimizer.zero_grad()
outputs = model(inputs)
cross_entropy_loss = F.cross_entropy(outputs, targets)
# accumulate weighted L1 and L2 norms over every parameter tensor in the model
for param in model.parameters():
    l1_regularization += lambda1 * torch.norm(param, 1)
    l2_regularization += lambda2 * torch.norm(param, 2)

loss = cross_entropy_loss + l1_regularization + l2_regularization
loss.backward()
optimizer.step()
