PyTorch: the dimensions are correct for cross-entropy but somehow wrong for MSE?

    # The code above is miscellaneous training data import code
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True, num_workers=4)
    testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=True, num_workers=4)

    dataiter = iter(trainloader)
    images, labels = next(dataiter)  # built-in next() works on DataLoader iterators

    from torch import nn, optim
    import torch.nn.functional as F

    model = nn.Sequential(nn.Linear(784, 128),
                          nn.ReLU(),
                          nn.Linear(128, 10),
                          nn.LogSoftmax(dim=1))
    model.to(device)

    # Define the loss
    criterion = nn.MSELoss()

    # Define the optimizer
    optimizer = optim.Adam(model.parameters(), lr=0.001)

    # Define the epochs
    epochs = 5
    train_losses, test_losses = [], []

    for e in range(epochs):
        running_loss = 0
        for images, labels in trainloader:
            images = images.to(device)
            labels = labels.to(device)
            # Flatten Fashion-MNIST images into a 784 long vector
            images = images.view(images.shape[0], -1)

            # Training pass
            optimizer.zero_grad()
            output = model.forward(images)
            # Fails here: output has shape (64, 10), labels has shape (64,)
            loss = criterion(output, labels)
            loss.backward()
            optimizer.step()

    <class 'torch.Tensor'>
    torch.Size([64, 1, 28, 28])
    torch.Size([64])
    ---------------------------------------------------------------------------
    RuntimeError                              Traceback (most recent call last)
    <ipython-input-62-ec6942122f02> in <module>
         44         output = model.forward(images)
         45
    ---> 46         loss = criterion(output, labels)
         47         loss.backward()
         48         optimizer.step()

    /opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
        530             result = self._slow_forward(*input, **kwargs)
        531         else:
    --> 532             result = self.forward(*input, **kwargs)
        533         for hook in self._forward_hooks.values():
        534             hook_result = hook(self, input, result)

    /opt/conda/lib/python3.7/site-packages/torch/nn/modules/loss.py in forward(self, input, target)
        429
        430     def forward(self, input, target):
    --> 431         return F.mse_loss(input, target, reduction=self.reduction)
        432
        433

    /opt/conda/lib/python3.7/site-packages/torch/nn/functional.py in mse_loss(input, target, size_average, reduce, reduction)
       2213         ret = torch.mean(ret) if reduction == 'mean' else torch.sum(ret)
       2214     else:
    -> 2215         expanded_input, expanded_target = torch.broadcast_tensors(input, target)
       2216         ret = torch._C._nn.mse_loss(expanded_input, expanded_target, _Reduction.get_enum(reduction))
       2217     return ret

    /opt/conda/lib/python3.7/site-packages/torch/functional.py in broadcast_tensors(*tensors)
         50             [0, 1, 2]])
         51     """
    ---> 52     return torch._C._VariableFunctions.broadcast_tensors(tensors)
         53
         54

    RuntimeError: The size of tensor a (10) must match the size of tensor b (64) at non-singleton dimension 1

1 answer

Anonymous user

#1 · Posted on 2024-04-24 03:22:08

nn.CrossEntropyLoss and nn.MSELoss are completely different loss functions, built on fundamentally different rationales.

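nn.CrossEntropyLoss is a loss function for discrete labeling (classification) tasks. It therefore expects per-class prediction scores as input and ground-truth discrete labels as target: x has shape (n, c), where c is the number of labels, and y has shape (n,) with an integer type, each target taking a value in {0, ..., c-1}. Note that PyTorch's nn.CrossEntropyLoss applies log-softmax internally, so x should hold raw scores (logits) rather than probabilities; for a model that already ends in nn.LogSoftmax, as in the question, nn.NLLLoss accepts the same shapes.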
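The shape contract for cross-entropy can be checked with a minimal sketch. The random tensors below are stand-ins for the model output and the Fashion-MNIST labels (an assumption, since the data-loading code is not shown), and the sizes 64 and 10 are taken from the question.

```python
import torch
from torch import nn

torch.manual_seed(0)
n, c = 64, 10  # batch size and number of classes, as in the question

# Hypothetical stand-ins for the real model output and labels
logits = torch.randn(n, c)          # raw scores, shape (n, c)
labels = torch.randint(0, c, (n,))  # class indices, shape (n,), dtype int64

# CrossEntropyLoss takes raw scores and integer class indices
ce_loss = nn.CrossEntropyLoss()(logits, labels)

# With an explicit LogSoftmax (as in the question's model), NLLLoss is the matching loss
log_probs = nn.LogSoftmax(dim=1)(logits)
nll_loss = nn.NLLLoss()(log_probs, labels)

print(logits.shape, labels.shape, labels.dtype)
print(ce_loss.item(), nll_loss.item())
```

Both losses reduce to a single scalar and should agree up to floating-point error, since cross-entropy in PyTorch is log-softmax followed by negative log-likelihood.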

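In contrast, nn.MSELoss is a loss function for regression tasks. It therefore expects predictions and targets to have the same shape and data type. That is, if your prediction has shape (n, c), the target should also have shape (n, c), not just (n,) as in the cross-entropy case. In the question, the output has shape (64, 10) and the labels have shape (64,); these cannot be broadcast together, hence the RuntimeError.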
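A small sketch of that mismatch, again using random stand-in tensors (an assumption) with the question's sizes. It reproduces the failure with a (64,) target and shows that a target of matching shape is accepted.

```python
import torch
from torch import nn

torch.manual_seed(0)
n, c = 64, 10

output = torch.randn(n, c)          # stand-in for the model output, shape (n, c)
labels = torch.randint(0, c, (n,))  # integer labels, shape (n,)

criterion = nn.MSELoss()

# Shapes (64, 10) and (64,) cannot be broadcast, so this raises RuntimeError
try:
    criterion(output, labels)
    shape_error = None
except RuntimeError as err:
    shape_error = str(err)
print("error:", shape_error)

# A float target with the same shape as the prediction works
same_shape_target = torch.randn(n, c)
loss = criterion(output, same_shape_target)
print("loss shape:", loss.shape)
```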

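If you insist on using MSE loss instead of cross-entropy, you need to convert your current integer target labels (shape (n,)) into one-hot vectors of shape (n, c), and only then compute the MSE loss between your predictions and the resulting one-hot targets.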
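One way to do this conversion is torch.nn.functional.one_hot. The sketch below is not the asker's full training loop: the tensors are random stand-ins, and exponentiating the LogSoftmax output so that probabilities (rather than log-probabilities) are compared with the one-hot targets is an assumption of this example, not something the answer prescribes.

```python
import torch
from torch import nn
import torch.nn.functional as F

torch.manual_seed(0)
n, c = 64, 10

# Stand-in for the question's model output, which ends in LogSoftmax
log_probs = torch.log_softmax(torch.randn(n, c), dim=1)
labels = torch.randint(0, c, (n,))  # integer labels, shape (n,)

# Convert labels to one-hot vectors of shape (n, c); MSELoss needs a float target
one_hot = F.one_hot(labels, num_classes=c).float()

# Assumption: compare probabilities, not log-probabilities, with the 0/1 targets
probs = log_probs.exp()

loss = nn.MSELoss()(probs, one_hot)
print(one_hot.shape, loss.item())
```

In the question's loop this would correspond to replacing labels with the one-hot tensor before calling criterion, so that both arguments have shape (64, 10).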
