Why does my convnet perform better at classification than at regression?

Published 2024-04-26 02:22:58


I have a dataset with 5 classes and roughly 100 images of 256x256 per class. The code differs only slightly from the basic PyTorch transfer-learning tutorial. Using vgg16 I get 77% classification accuracy. I then switch the architecture to regression with the following line:

model_ft.classifier[6] = nn.Linear(4096, 1) 

Doing so, I get a 56% binned accuracy (rounding the regression output to the nearest class before scoring). I also switched my loss to MSE, which converges to a loss of 0.2698. Why does the regression model perform worse than the classification model?
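For reference, the binning described above can be sketched as a standalone helper. `binned_accuracy` is a hypothetical name of my own; it mirrors the `my_correct` logic in the code below, with 5 classes normalized to evenly spaced points in [0, 1]:

```python
import torch

def binned_accuracy(outputs, labels, num_classes=5):
    """Round regression outputs back to the nearest of num_classes
    evenly spaced bins in [0, 1] and report the match rate."""
    steps = num_classes - 1
    preds = torch.round(outputs.squeeze(1) * steps) / steps
    return (preds == labels).float().mean().item()

outputs = torch.tensor([[0.10], [0.55], [0.80]])  # raw regression outputs
labels = torch.tensor([0.00, 0.25, 0.75])         # classes 0, 1, 3 scaled by 1/4
print(binned_accuracy(outputs, labels))           # 2 of the 3 bins match
```

The float equality inside is safe here because multiples of 0.25 are exactly representable in floating point.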

The code in detail:

from __future__ import print_function, division

import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
import torchvision

import numpy as np
from torchvision import datasets, models, transforms
import matplotlib.pyplot as plt
import time
import os
import copy
import argparse
import transfer_learning_depths
import vgg_r
plt.ion()   # interactive mode

###############################################################################
def labelToConts(target):
    """ 
    'continous' classes- normalizes classes 0,1,2,3,4 to the domain [0,1]
    """
    target = target.numpy().astype(float)
    for i in range(0,target.size):
        #target[i] = math.floor((target[i] - 2)/2) for neg classes?
        target[i] = ((target[i])/4)
        #print(target[i])
    target = torch.from_numpy(target)
    return target  
###############################################################################  
"""my accuracy metric"""
def my_correct(outputs,labels):
    outputs_4 = outputs * 4
    outputs_4 = torch.round(outputs_4)
    outputs_4 = outputs_4 / 4
    return torch.sum(outputs_4 == labels.data.float())

###############################################################################

def train_model_classify(model,optimizer, scheduler, args):
    since = time.time()
    criterion = nn.CrossEntropyLoss()
    best_model_wts = copy.deepcopy(model.state_dict()) #keep track of best model?
    best_acc = 0.0
    for epoch in range(args.epochs):
        print('Epoch {}/{}'.format(epoch, args.epochs - 1))
        print('-' * 10)
        # Each epoch has a training and validation phase
        for phase in ['train', 'val']: #iterates through train and eval for each epoch
            if phase == 'train':
                scheduler.step()
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode
            running_loss = 0.0
            running_corrects = 0
            # Iterate over data.
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)  # move the batch to CPU or GPU; only the current batch is held in memory
                labels = labels.to(device)
                # zero the parameter gradients
                optimizer.zero_grad()
                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'): 
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)
                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()
                # statistics
                running_loss += loss.item() * inputs.size(0) #loss * batch size
                running_corrects += torch.sum(preds == labels.data)

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]
            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))
            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())
    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:4f}'.format(best_acc))
    # load best model weights
    model.load_state_dict(best_model_wts)
    return model

##############################################################################  
def train_model_regress(model, optimizer, scheduler, num_epochs):
    since = time.time()
    best_model_wts = copy.deepcopy(model.state_dict()) #keep track of best model?
    best_acc = 0.0       
    criterion = nn.MSELoss(reduction='mean' if args.mse else 'sum')  # size_average is deprecated; True meant 'mean'

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']: #iterates through train and eval for each epoch
            if phase == 'train':
                scheduler.step()
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode
            running_loss = 0.0
            running_corrects = 0
            # Iterate over data.
            for inputs, labels in dataloaders[phase]: 
                # normalize the integer class labels to [0, 1]
                inputs = inputs.to(device)  # move the batch to CPU or GPU; only the current batch is held in memory
                labels = labelToConts(labels)
                labels = labels.to(device)
                labels = labels.float()
                labels = labels.view(-1,1)
                #setting up for foward pass
                optimizer.zero_grad()# zero the parameter gradients
                with torch.set_grad_enabled(phase == 'train'): # track history if only in train
                    outputs = model(inputs)  # shape (N, 1); no argmax needed for regression
                    loss = criterion(outputs, labels)
                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()
                    #print out final epochs outputs and labels
                    if ((phase == 'val') and (epoch == (num_epochs - 1))):
                        for i in range(outputs.shape[0]):
                            print(outputs[i].item(),labels[i].item())
                            with open("tab_" + args.save_dir , "a") as text_file:
                                text_file.write('{},{}\n'.format(outputs[i].item(),labels[i].item()))
                running_loss += loss.item() * inputs.size(0) #loss * batch size
                running_corrects += my_correct(outputs,labels) #torch.sum(preds == labels.data.long())             
            #relaying results
            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]
            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))
            with open(args.save_dir, "a") as text_file:
                text_file.write('{} , {:.4f} , {:.4f}\n'.format(
                phase, epoch_loss, epoch_acc))
            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())
    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:4f}'.format(best_acc))
    # load best model weights
    model.load_state_dict(best_model_wts)
    return model

##############################################################################
"""arg parse for command prompt"""
parser = argparse.ArgumentParser(description='PyTorch ImageNet Training')
parser.add_argument('-j', '--workers', default=8, type=int, metavar='N',
                    help='number of data loading workers (default: 8)')
parser.add_argument('--epochs', default=10, type=int, metavar='N',
                    help='number of total epochs to run')
parser.add_argument('-b', '--batch-size', default=8, type=int,
                    metavar='N', help='mini-batch size (default: 8)')
parser.add_argument('--lrc', default=1e-3, type=float,
                    metavar='LR', help='initial learning rate (classification)')
parser.add_argument('--lrr', default=5e-5, type=float,
                    metavar='LR', help='initial learning rate (regression)')
parser.add_argument('--momentum', default=0.9, type=float, metavar='M',
                    help='momentum')
parser.add_argument('--step-size', default=7, type=int, dest='step_size',
                    help='step size for learning rate decay')
parser.add_argument('--gamma', default=0.1, type=float,  metavar='M',
                    help='lr decay per step size')
parser.add_argument('--data-dir', dest='data_dir',
                    help='The directory used to load dataset',
                    default='sample/256_dataset', type=str)
parser.add_argument('--save-dir', dest='save_dir',
                    help='The directory used to save the trained models',
                    default='save_temp_r.txt', type=str)
parser.add_argument('--archs', metavar='N',
                    help='a case loop for transfer learning architecture (1 through 5)',
                    default=1, type=int)
parser.add_argument('--angle-rotation', dest='angle_rotation',
                    help='random rotation range in degrees',
                    default=15, type=int)
parser.add_argument('--mse', default=False, action='store_true',
                    help='use mean-reduced MSE instead of summed')

##############################################################################
"""setting up datasets"""
# Data augmentation and normalization for synthetic training set images
args = parser.parse_args()
data_transforms = { 
    'train': transforms.Compose([
        transforms.RandomAffine((args.angle_rotation * -1, args.angle_rotation)),
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}

data_dir = args.data_dir
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), # one folder per split
                                          data_transforms[x])
                  for x in ['train', 'val']} 
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=args.batch_size, #dataset => loader
                                             shuffle=True, num_workers=args.workers)
              for x in ['train', 'val']}
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'val']} #abstract of multiple classes: size of each set
class_names = image_datasets['train'].classes #labels

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") #gpu or cpu
##############################################################################
"""main method"""
#sanity check 
with open(args.save_dir, "w") as text_file: #text file for logging (smaller than saving model params)
    text_file.write(str(args))
print(str(args))
"""classify"""
model_ft = models.vgg16_bn(pretrained=True)
model_ft.classifier[6] = nn.Linear(4096, 5)

optimizer_ft = optim.SGD(model_ft.parameters(), lr=args.lrc, momentum=args.momentum) # Observe that all parameters are being optimized
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=args.step_size, gamma=args.gamma) # Decay LR by a factor of 0.1 every 7 epochs
model_ft = model_ft.to(device) # move model to CPU or GPU
model_ft = train_model_classify(model_ft, optimizer_ft, exp_lr_scheduler,args)
"""regression"""
for param in model_ft.features.parameters():
    param.requires_grad = False
model_ft.classifier[6] = nn.Linear(4096, 1)   

optimizer_ft = optim.SGD(model_ft.classifier.parameters(), lr=args.lrr, momentum=args.momentum)
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=args.step_size, gamma=args.gamma)
model_ft = model_ft.to(device) #move to cpu or gpu
#go time
model_ft = train_model_regress(model_ft, optimizer_ft, exp_lr_scheduler,
                       num_epochs=args.epochs)
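For completeness, the regression head swap above can be sketched in isolation. The `Sigmoid` is my own optional addition, not in the code above: it squashes the single output into [0, 1], matching the normalized labels produced by `labelToConts`:

```python
import torch
import torch.nn as nn

# Regression head: one scalar per image instead of 5 class logits.
# The Sigmoid keeps predictions inside the label range [0, 1].
head = nn.Sequential(
    nn.Linear(4096, 1),   # would replace model_ft.classifier[6]
    nn.Sigmoid(),
)

features = torch.randn(8, 4096)  # dummy batch of penultimate-layer features
preds = head(features)
print(preds.shape, float(preds.min()) >= 0.0, float(preds.max()) <= 1.0)
```

Without the squashing nonlinearity, early-training outputs can fall well outside the label range, which inflates the MSE.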

Update 1: This is a re-run of the experiment, so the results differ slightly from the above. Classification results:

Namespace(angle_rotation=15, archs=1, batch_size=8, data_dir='sample/256_dataset', epochs=10, gamma=0.1, lrc=0.001, lrr=5e-05, momentum=0.9, mse=False, save_dir='save_temp_r.txt', step_size=7, workers=8)
Epoch 0/9
----------
train Loss: 1.2778 Acc: 0.4402
val Loss: 1.1267 Acc: 0.5569
Epoch 1/9
----------
train Loss: 1.0335 Acc: 0.5915
val Loss: 0.7418 Acc: 0.6228
Epoch 2/9
----------
train Loss: 0.8455 Acc: 0.6672
val Loss: 0.6596 Acc: 0.7186
Epoch 3/9
----------
train Loss: 0.8310 Acc: 0.6732
val Loss: 0.9665 Acc: 0.6527
Epoch 4/9
----------
train Loss: 0.7381 Acc: 0.7110
val Loss: 0.5091 Acc: 0.8144
Epoch 5/9
----------
train Loss: 0.6798 Acc: 0.7322
val Loss: 1.7675 Acc: 0.4850
Epoch 6/9
----------
train Loss: 0.6773 Acc: 0.7489
val Loss: 0.4319 Acc: 0.8323
Epoch 7/9
----------
train Loss: 0.6199 Acc: 0.7610
val Loss: 0.4467 Acc: 0.8383
Epoch 8/9
----------
train Loss: 0.5240 Acc: 0.8094
val Loss: 0.4593 Acc: 0.8323
Epoch 9/9
----------
train Loss: 0.5431 Acc: 0.7761
val Loss: 0.5307 Acc: 0.7784
Training complete in 10m 53s
Best val Acc: 0.838323

Regression results:

Epoch 0/9
----------
train Loss: 1.1029 Acc: 0.2753
val Loss: 0.7975 Acc: 0.2754
Epoch 1/9
----------
train Loss: 0.7637 Acc: 0.3222
val Loss: 0.4878 Acc: 0.4611
Epoch 2/9
----------
train Loss: 0.7492 Acc: 0.3147
val Loss: 0.7000 Acc: 0.3293
Epoch 3/9
----------
train Loss: 0.6129 Acc: 0.3782
val Loss: 0.9696 Acc: 0.2156
Epoch 4/9
----------
train Loss: 0.5920 Acc: 0.3676
val Loss: 0.4152 Acc: 0.3892
Epoch 5/9
----------
train Loss: 0.6302 Acc: 0.3464
val Loss: 0.8849 Acc: 0.1916
Epoch 6/9
----------
train Loss: 0.5896 Acc: 0.3707
val Loss: 0.4919 Acc: 0.2874
Epoch 7/9
----------
train Loss: 0.5016 Acc: 0.3722
val Loss: 0.4235 Acc: 0.3174
Epoch 8/9
----------
train Loss: 0.4701 Acc: 0.3949
val Loss: 0.4893 Acc: 0.3413
Epoch 9/9
----------
val Loss: 0.5068 Acc: 0.3114
Training complete in 4m 43s
Best val Acc: 0.461078

1 Answer

So, you get 77% with vgg and 56% with the regression head. That seems plausible. Your best MSE loss was 0.2698, right? Could you post the per-epoch accuracy results?

IMO the regression results should be worse in any case: regression uses MSE loss, which assumes a Gaussian distribution of the labels. In a classification problem, however, we know the labels are not Gaussian but discrete (categorical; Bernoulli in the two-class case). Cross-entropy is therefore a better objective to minimize than MSE, and so yields better performance.
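A toy comparison makes the point concrete; the numbers below are my own construction, not from the thread. The same wrong, confident prediction is scored once as 5-class logits under cross-entropy and once as a normalized scalar under MSE:

```python
import torch
import torch.nn as nn

# Truth is class 4; the model insists on class 0.
logits = torch.tensor([[2.0, 0.1, 0.1, 0.1, 0.1]])
target = torch.tensor([4])
ce = nn.CrossEntropyLoss()(logits, target)

pred = torch.tensor([0.0])   # regression guess for class 0 (0/4)
true = torch.tensor([1.0])   # class 4 normalized to 4/4
mse = nn.MSELoss()(pred, true)

# MSE is bounded by 1 on this normalized scale, while cross-entropy keeps
# growing with the model's (misplaced) confidence.
print(ce.item(), mse.item())
```

The capped MSE penalty gives the optimizer a much weaker signal to correct confident mistakes than the unbounded cross-entropy penalty does.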
