Why does accuracy drop when I apply a ReLU activation after a linear layer?

import torch
import torch.nn as nn
import torch.optim as optim

# `device` is not defined in the posted snippet; a CUDA device like this one is assumed
device = torch.device("cuda:0")

# custom class neural network
class FashionMnistClassifier(nn.Module):
    def __init__(self, n_inputs, n_out):
        super().__init__()
        self.cnn1 = nn.Conv2d(n_inputs, out_channels=32, kernel_size=5).cuda(device)
        self.cnn2 = nn.Conv2d(32, out_channels=64, kernel_size=5).cuda(device)
        #self.cnn3 = nn.Conv2d(n_inputs, out_channels=32, kernel_size=5)
        self.fc1 = nn.Linear(64*4*4, out_features=100).cuda(device)
        self.fc2 = nn.Linear(100, out_features=n_out).cuda(device)
        self.relu = nn.ReLU().cuda(device)
        self.pool = nn.MaxPool2d(kernel_size=2).cuda(device)
        self.soft_max = nn.Softmax().cuda(device)

    def forward(self, x):
        x = x.cuda(device)  # .cuda() returns a new tensor, so the result must be assigned
        out = self.relu(self.cnn1(x))
        out = self.pool(out)
        out = self.relu(self.cnn2(out))
        out = self.pool(out)
        #print("out shape in classifier forward func: ", out.shape)
        out = self.fc1(out.view(out.size(0), -1))
        #out = self.relu(out) # if I uncomment these then the Accuracy decrease from 90 to 50!!!
        out = self.fc2(out)
        #out = self.relu(out) # this too
        return out

n_batch = 100
n_outputs = 10
LR = 0.001

model = FashionMnistClassifier(1, 10).cuda(device)
optimizer = optim.Adam(model.parameters(), lr=LR)
criterion = nn.CrossEntropyLoss()

1 answer

User

#1 · Posted 2024-04-20 12:55:02

CrossEntropyLoss expects unnormalized logits as input, i.e. the raw output of the last Linear layer.

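If you apply ReLU to the output of the last layer, you only output values in the range [0, inf), whereas a neural network tends to use small values for incorrect labels and large values for the correct one (we could say it is overconfident in its predictions). Also, the class with the highest logit is the one argmax picks as the predicted label.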
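To make the clamping effect concrete, here is a small sketch in plain Python. The logit values are made up for illustration, and `cross_entropy` and `relu` are hypothetical helpers written for this example rather than the PyTorch functions; the helper computes the same quantity as softmax followed by negative log-likelihood for a single sample.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy of one sample: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def relu(values):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in values]

# Made-up raw logits for a 3-class problem; class 0 is the true label.
raw = [4.0, -3.0, -5.0]
clamped = relu(raw)  # negative scores for the wrong classes are pulled up to 0

loss_raw = cross_entropy(raw, 0)
loss_clamped = cross_entropy(clamped, 0)
print("loss on raw logits:   ", loss_raw)
print("loss on ReLU'd logits:", loss_clamped)

# If every raw logit is negative, ReLU maps them all to 0 and the classes
# become indistinguishable: the loss is log(number of classes) for any target.
all_negative = relu([-1.0, -2.0, -0.5])
loss_flat = cross_entropy(all_negative, 2)
print("loss when all logits are clamped to 0:", loss_flat)
```

In this sketch the loss on the clamped logits comes out larger than on the raw ones, because the network can no longer push the wrong classes below zero. In the all-negative case ReLU also passes no gradient back, which may be part of why training stalls.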

So it will definitely not work with this line enabled:

# out = self.relu(out) # this too

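It should work with the ReLU before it, though (the one after fc1). Just remember that more nonlinearity is not always good for the network.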
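Putting this together, a minimal sketch of the forward pass could look like the following. It keeps the layer sizes from the question but runs on the CPU with a random batch standing in for Fashion-MNIST images, so the input tensor and labels here are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class FashionMnistClassifier(nn.Module):
    def __init__(self, n_inputs, n_out):
        super().__init__()
        self.cnn1 = nn.Conv2d(n_inputs, 32, kernel_size=5)
        self.cnn2 = nn.Conv2d(32, 64, kernel_size=5)
        self.fc1 = nn.Linear(64 * 4 * 4, 100)
        self.fc2 = nn.Linear(100, n_out)
        self.relu = nn.ReLU()
        self.pool = nn.MaxPool2d(kernel_size=2)

    def forward(self, x):
        out = self.pool(self.relu(self.cnn1(x)))    # 28x28 -> 24x24 -> 12x12
        out = self.pool(self.relu(self.cnn2(out)))  # 12x12 -> 8x8 -> 4x4
        out = out.view(out.size(0), -1)             # flatten to (N, 64*4*4)
        out = self.relu(self.fc1(out))              # ReLU between the linear layers
        return self.fc2(out)                        # raw logits, no activation

model = FashionMnistClassifier(1, 10)
batch = torch.randn(8, 1, 28, 28)      # random stand-in for a batch of images
labels = torch.randint(0, 10, (8,))    # random stand-in labels

logits = model(batch)
loss = nn.CrossEntropyLoss()(logits, labels)  # applies log-softmax internally
predictions = logits.argmax(dim=1)            # highest logit wins
print(logits.shape, loss.item(), predictions)
```

Whether the ReLU after fc1 actually helps is something to check against validation accuracy; the only firm rule here is that fc2's output goes to the loss untouched.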
