PyTorch CUDA error: device-side assert triggered when training a conv1d classifier

Posted on 2024-04-20 07:24:28


I am implementing a CNN multi-label classifier with PyTorch, and it keeps raising this error:

"CUDA error: device-side assert triggered". The message points at the loss function, but when I swap the loss function out the error still appears and points at some other part of the code (seemingly at random). When I move everything to the CPU it instead says "index out of range in self", yet when I inspect my data loader I don't see anything unusual.

I have 15 classes and 59462 unique tokens, and every document is 30000 tokens long.

My model and loss function are shown below:


class model(nn.Module):
    def __init__(self, num_classes=15):
        super(model, self).__init__()
        
        # vocabulary of 59462 tokens -> 400-d embeddings; valid token indices are 0..59461
        self.embedding = nn.Sequential(nn.Embedding(59462,400),nn.Dropout(0.15))
        self.features = nn.Sequential(
            nn.Conv1d(400, 500, kernel_size=3, stride=1, padding=False), nn.ReLU(),
            nn.Dropout(0.05), nn.MaxPool1d(kernel_size=2), nn.Dropout(0.15))

        self.linear = nn.Linear(500*14999, 15)
        
    def forward(self, x):
        x = self.embedding(x)               # (batch, 30000) -> (batch, 30000, 400)
        x = x.permute(0,2,1)                # -> (batch, 400, 30000), channels first for Conv1d
        x = self.features(x)                # conv (k=3, no padding) -> 29998, maxpool(2) -> (batch, 500, 14999)
        x = x.view(x.size(0), 500*14999)    # flatten to (batch, 500*14999)
        x = self.linear(x)

        return x
    
model = model()
model = model.to(device)

def loss_fn(outputs, targets):
    return torch.nn.BCEWithLogitsLoss(pos_weight=pos_weight.to(device))(outputs, targets).to(device) 

#pos_weight is for my unbalanced data
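
For context, the pos_weight I pass in is a tensor of length 15 (one weight per class, as BCEWithLogitsLoss expects for multi-label targets). A minimal sketch of how I build it, assuming train_targets is a (num_docs, 15) float tensor of 0/1 labels (the variable name is just a placeholder):

# pos_weight[i] = (#negatives of class i) / (#positives of class i)
pos_count = train_targets.sum(dim=0)              # positives per class
neg_count = train_targets.size(0) - pos_count     # negatives per class
pos_weight = neg_count / pos_count.clamp(min=1)   # clamp to avoid division by zero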

This is the error message when running on the CPU:

/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py in run_cell_magic(self, magic_name, line, cell)
   2115             magic_arg_s = self.var_expand(line, stack_depth)
   2116             with self.builtin_trap:
-> 2117                 result = fn(magic_arg_s, cell)
   2118             return result
   2119 

<decorator-gen-60> in time(self, line, cell, local_ns)

/usr/local/lib/python3.6/dist-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
    186     # but it's overkill for just that one bit of state.
    187     def magic_deco(arg):
--> 188         call = lambda f, *a, **k: f(*a, **k)
    189 
    190         if callable(arg):

/usr/local/lib/python3.6/dist-packages/IPython/core/magics/execution.py in time(self, line, cell, local_ns)
   1191         else:
   1192             st = clock2()
-> 1193             exec(code, glob, local_ns)
   1194             end = clock2()
   1195             out = None

<timed exec> in <module>()

<ipython-input-67-09829392983d> in train_epoch(model, data_loader, loss_fn, optimizer, device, scheduler, n_examples)
     27         targets = d["targets"].to(device)
     28         outputs = model(
---> 29           tokens
     30         )
     31 

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

<ipython-input-59-2c2ec554cb05> in forward(self, x)
     16 
     17     def forward(self, x):
---> 18         x = self.embedding(x)
     19         x = x.permute(0,2,1)
     20         x = self.features(x)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/container.py in forward(self, input)
    115     def forward(self, input):
    116         for module in self:
--> 117             input = module(input)
    118         return input
    119 

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/sparse.py in forward(self, input)
    124         return F.embedding(
    125             input, self.weight, self.padding_idx, self.max_norm,
--> 126             self.norm_type, self.scale_grad_by_freq, self.sparse)
    127 
    128     def extra_repr(self) -> str:

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
   1850         # remove once script supports set_grad_enabled
   1851         _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 1852     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
   1853 
   1854 

IndexError: index out of range in self
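
When I said I inspected my data loader, this is roughly the check I ran (a minimal sketch; train_data_loader is a placeholder for my loader, and I'm assuming the token ids sit under a "tokens" key, mirroring the "targets" key in my training loop):

max_token = -1
for d in train_data_loader:
    max_token = max(max_token, int(d["tokens"].max()))
print(max_token)  # should be at most 59461, i.e. strictly below num_embeddings=59462

Since the traceback ends inside torch.embedding, any token id equal to or above 59462 (or any negative id) would explain the IndexError; that is what this loop was meant to rule out.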

Does anyone know what this problem is and how I can fix it? Thanks in advance.

