I don't understand the dimension conversions when computing the loss function in PyTorch

I'm new to neural networks, so I apologize if I describe something incorrectly. I'm trying to build a model that generates text character by character with PyTorch, using a fixed window of 5 characters. The model itself looks fine to me, but after I compute the loss function and train the model, all I get is the same letter repeated over and over:

    èJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
    %333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333
    őJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ

I was expecting something that looks like sentences. I'm not very good with tensor dimensions yet, so I think I permuted a tensor incorrectly somewhere.

So, my model is:

    self.pad = nn.ZeroPad2d((5, 0, 0, 0))  # prepend 5 zeros along the sequence dimension
    self.emb = nn.Embedding(n_tokens, emb_size)  # max_norm=True
    self.conv_1 = nn.Conv1d(emb_size, hid_size, kernel_size=6, stride=1, padding=0)  # (seq+5) - 6 + 1 = seq
    self.fc = nn.Linear(hid_size, n_tokens)

Here I permute it twice, because otherwise it doesn't work:

    input = self.pad(input)         # [batch, seq] -> [batch, seq+5]
    input = self.emb(input)         # -> [batch, seq+5, emb_size]
    input = input.permute(0, 2, 1)  # -> [batch, emb_size, seq+5]; Conv1d expects [N, C, L]
    input = self.conv_1(input)      # -> [batch, hid_size, seq]
    input = input.permute(0, 2, 1)  # -> [batch, seq, hid_size]; Linear expects features last
    input = self.fc(input)          # -> [batch, seq, n_tokens]

As output I get a tensor of shape [batch_size, sequence_size, number_of_tokens].
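
To double-check that shape, here is a minimal self-contained version of the model above with toy sizes (n_tokens=60, emb_size=16, hid_size=64 are made-up values):

    import torch
    import torch.nn as nn

    class CharModel(nn.Module):
        # minimal wrapper around the layers above, just to check shapes
        def __init__(self, n_tokens, emb_size, hid_size):
            super().__init__()
            self.pad = nn.ZeroPad2d((5, 0, 0, 0))
            self.emb = nn.Embedding(n_tokens, emb_size)
            self.conv_1 = nn.Conv1d(emb_size, hid_size, kernel_size=6, stride=1, padding=0)
            self.fc = nn.Linear(hid_size, n_tokens)

        def forward(self, x):           # x: [batch, seq] of character ids
            x = self.pad(x)             # [batch, seq+5]
            x = self.emb(x)             # [batch, seq+5, emb_size]
            x = x.permute(0, 2, 1)      # [batch, emb_size, seq+5]
            x = self.conv_1(x)          # [batch, hid_size, seq]
            x = x.permute(0, 2, 1)      # [batch, seq, hid_size]
            return self.fc(x)           # [batch, seq, n_tokens]

    model = CharModel(n_tokens=60, emb_size=16, hid_size=64)
    x = torch.randint(0, 60, (2, 10))  # batch of 2, sequences of length 10
    print(model(x).shape)              # torch.Size([2, 10, 60])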

When I compute the loss, I don't really understand what the tensor dimensions should look like before they go into the loss function. So right now it's like this:

    input = torch.as_tensor(input_ix, dtype=torch.int64)  # [batch, seq] character ids
    logits = model(input)                                  # [batch, seq, n_tokens]
    reference_answers = input

    # the mask is needed because I pad every example to the same sequence length
    mask = compute_mask(input).to(torch.int32)

    criterion = nn.CrossEntropyLoss()
    probs = nn.Softmax(dim=1)
    softmax_output = probs(logits)                         # softmax over dim 1, i.e. the sequence dimension
    mask_ = mask.unsqueeze(-1).expand(softmax_output.size())
    softmax_masked = softmax_output * mask_                # zero out the padded positions
    softmax_masked = softmax_masked.permute(0, 2, 1)       # -> [batch, n_tokens, seq]
    loss = criterion(softmax_masked, reference_answers)
    return loss
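
From reading the nn.CrossEntropyLoss docs, I think the loss wants raw, un-normalized logits (it applies log-softmax internally) with the class dimension on axis 1, so applying Softmax first may be my mistake. Here is a minimal sketch of what I believe the loss step should look like; PAD_ID = 0 and the toy shapes are my assumptions:

    import torch
    import torch.nn as nn

    batch_size, seq_len, n_tokens = 32, 100, 60  # toy shapes, made up
    PAD_ID = 0                                   # assumption: padded positions use token id 0

    logits = torch.randn(batch_size, seq_len, n_tokens)          # raw model output, no softmax
    targets = torch.randint(1, n_tokens, (batch_size, seq_len))  # reference character ids

    # CrossEntropyLoss expects input [batch, n_tokens, seq] vs. target [batch, seq];
    # ignore_index skips the padded targets instead of multiplying probabilities by a mask
    criterion = nn.CrossEntropyLoss(ignore_index=PAD_ID)
    loss = criterion(logits.permute(0, 2, 1), targets)

If ignore_index doesn't fit, I guess the same masking idea would work with reduction='none', multiplying the per-position losses by the mask and averaging over the unmasked positions. I also notice that reference_answers = input means my targets are not shifted; for next-character prediction I believe the target at position t should be the character at position t + 1.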

I feel like everything I'm doing here is wrong. I just want to understand how this works in real life, because I don't have an example to follow.