I followed this tutorial (https://mccormickml.com/2019/07/22/BERT-fine-tuning/#a1-saving--loading-fine-tuned-model) to fine-tune BertForSequenceClassification. After training the model, I want to load it and write a function classify_sentence(sentence) that takes a sentence and returns the predicted logit vector:
import torch
from transformers import BertForSequenceClassification, BertTokenizer

def classify_sentence(self, sentence):
    self.model = BertForSequenceClassification.from_pretrained(output_dir)
    self.tokenizer = BertTokenizer.from_pretrained(output_dir)
    self.model.eval()
    encoded_dict = self.tokenizer.encode_plus(
        sentence,                    # Sentence to encode.
        add_special_tokens=True,     # Add '[CLS]' and '[SEP]'.
        max_length=64,               # Pad & truncate all sentences.
        pad_to_max_length=True,
        return_attention_mask=True,  # Construct attention masks.
        return_tensors='pt',         # Return PyTorch tensors.
    )
    # With return_tensors='pt', encode_plus already returns batched
    # tensors of shape [1, max_length], so no torch.cat is needed.
    input_id = encoded_dict['input_ids']
    attention_mask = encoded_dict['attention_mask']
    with torch.no_grad():
        output = self.model(input_id,
                            token_type_ids=None,
                            attention_mask=attention_mask)
    logits = output[0]
    return logits
output_dir is a directory containing the following files: config.json, pytorch_model.bin, special_tokens_map.json, tokenizer_config.json, and vocab.txt.
When I run this function, I get an error:

AttributeError: 'BertTokenizer' object has no attribute 'encode_plus'
However, I encoded sentences this same way during training. Is there another way to tokenize a sentence after loading the trained BERT model?
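One workaround, assuming the installed BertTokenizer is an older implementation (e.g. from pytorch-pretrained-bert, which predates encode_plus but does provide tokenize() and convert_tokens_to_ids()), is to build the input IDs and attention mask manually. This is a minimal sketch; encode_sentence is a hypothetical helper name, and 0 is assumed to be the [PAD] token ID, as in standard BERT vocabularies:

```python
# Minimal sketch: manual replacement for encode_plus on an older
# BertTokenizer that only has tokenize() / convert_tokens_to_ids().
import torch

def encode_sentence(tokenizer, sentence, max_length=64):
    # Add '[CLS]' and '[SEP]', truncating so the total fits max_length.
    tokens = ['[CLS]'] + tokenizer.tokenize(sentence)[:max_length - 2] + ['[SEP]']
    ids = tokenizer.convert_tokens_to_ids(tokens)
    # Pad with 0 ('[PAD]' in standard BERT vocabs) and build the mask
    # that marks real tokens (1) vs. padding (0).
    pad_len = max_length - len(ids)
    attention_mask = [1] * len(ids) + [0] * pad_len
    ids = ids + [0] * pad_len
    # Wrap in a batch dimension, matching return_tensors='pt'.
    return torch.tensor([ids]), torch.tensor([attention_mask])
```

The returned tensors have shape [1, max_length] and can be passed to the model exactly like the input_ids and attention_mask produced by encode_plus.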