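I am building a program that takes the Fashion-MNIST dataset as input, and I am tweaking my model to see how different parameters change the accuracy.
One of the tweaks I made was to change the model's loss function from cross-entropy to MSE.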
# The code above is miscellaneous training data import code
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True, num_workers=4)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=True, num_workers=4)
dataiter = iter(trainloader)
images, labels = next(dataiter)

from torch import nn, optim
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(784, 128),
                      nn.ReLU(),
                      nn.Linear(128, 10),
                      nn.LogSoftmax(dim=1)
                      )
model.to(device)

# Define the loss
criterion = nn.MSELoss()
# Define the optimizer
optimizer = optim.Adam(model.parameters(), lr=0.001)
# Define the epochs
epochs = 5
train_losses, test_losses = [], []

for e in range(epochs):
    running_loss = 0
    for images, labels in trainloader:
        images = images.to(device)
        labels = labels.to(device)
        # Flatten Fashion-MNIST images into a 784 long vector
        images = images.view(images.shape[0], -1)

        # Training pass
        optimizer.zero_grad()
        output = model.forward(images)
        loss = criterion(output, labels)
        loss.backward()
        optimizer.step()
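With cross-entropy loss my model works without any problems, but when I switch to MSE loss the interpreter complains that my tensors have different sizes, so the loss cannot be computed: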
<class 'torch.Tensor'>
torch.Size([64, 1, 28, 28])
torch.Size([64])
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-62-ec6942122f02> in <module>
     44         output = model.forward(images)
     45
---> 46         loss = criterion(output, labels)
     47         loss.backward()
     48         optimizer.step()

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/loss.py in forward(self, input, target)
    429
    430     def forward(self, input, target):
--> 431         return F.mse_loss(input, target, reduction=self.reduction)
    432
    433

/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py in mse_loss(input, target, size_average, reduce, reduction)
   2213         ret = torch.mean(ret) if reduction == 'mean' else torch.sum(ret)
   2214     else:
-> 2215         expanded_input, expanded_target = torch.broadcast_tensors(input, target)
   2216         ret = torch._C._nn.mse_loss(expanded_input, expanded_target, _Reduction.get_enum(reduction))
   2217     return ret

/opt/conda/lib/python3.7/site-packages/torch/functional.py in broadcast_tensors(*tensors)
     50                 [0, 1, 2]])
     51     """
---> 52     return torch._C._VariableFunctions.broadcast_tensors(tensors)
     53
     54

RuntimeError: The size of tensor a (10) must match the size of tensor b (64) at non-singleton dimension 1
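I have tried reshaping my tensors and creating new arrays as placeholders for the output array, but I seem to be getting nowhere.
Why does cross-entropy loss run without any error while MSE loss does not?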
nn.CrossEntropyLoss and nn.MSELoss are completely different loss functions, with fundamentally different rationales behind them.
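nn.CrossEntropyLoss is a loss function for discrete labeling tasks. It therefore expects a prediction of per-class scores as input (raw, unnormalized scores in the case of nn.CrossEntropyLoss) and ground-truth discrete labels as target: x has shape n x c (where c is the number of labels), and y has shape n and integer type, with each target taking a value in the range {0, ..., c-1}.

In contrast, nn.MSELoss is a loss function for regression tasks. It therefore expects the prediction and the target to have the same shape and data type. That is, if your prediction has shape n x c, the target should also have shape n x c (and not just n, as in the cross-entropy case). In the question, the prediction has shape 64 x 10 while the labels have shape 64; these two shapes cannot be broadcast against each other, which is exactly what the RuntimeError reports.

If you insist on using MSE loss instead of cross-entropy, you need to convert your current integer target labels (of shape n) into one-hot vectors of shape n x c, and only then compute the MSE loss between your predictions and the resulting one-hot targets.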
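The one-hot conversion can be sketched as follows. This is a minimal illustration rather than the asker's actual training loop: `output` and `labels` are random stand-ins for one batch, and `num_classes`, `batch_size`, and `target` are names chosen here for the example. Because the model in the question ends with `nn.LogSoftmax`, the sketch exponentiates the output before comparing it to the one-hot target; that step is an assumption about what is wanted, not something MSE itself requires.

```python
import torch
import torch.nn.functional as F
from torch import nn

torch.manual_seed(0)

num_classes = 10   # Fashion-MNIST has 10 classes
batch_size = 64    # same batch size as in the question

# Stand-ins for one batch: log-probabilities of shape (n, c), as a model ending
# in LogSoftmax would produce, and integer class labels of shape (n,).
output = torch.randn(batch_size, num_classes).log_softmax(dim=1)
labels = torch.randint(0, num_classes, (batch_size,))

# Classification losses accept integer class indices of shape (n,) directly.
nll = nn.NLLLoss()(output, labels)

# MSE needs a floating-point target with the same shape as the prediction,
# so the integer labels are expanded to one-hot rows of shape (n, c).
target = F.one_hot(labels, num_classes=num_classes).float()

# Assumption: compare probabilities (exp of the log-probabilities) to the
# one-hot target, since both then live in the range [0, 1].
mse = nn.MSELoss()(output.exp(), target)

print(output.shape, labels.shape, target.shape)
print(nll.item(), mse.item())
```

In the training loop from the question, the corresponding change would be to build `target` from `labels` right before the call to `criterion` and to pass `target` instead of `labels`. Whether MSE then trains as well as cross-entropy is a separate matter that this sketch does not address.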