Using a trained BERT model for prediction in deployment

Posted 2024-03-29 09:53:56

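I am currently working on a multi-label classification task for text data. I have a data frame with an ID column, a text column, and several label columns that contain only 1 or 0.

I used an existing solution from the Kaggle kernel "Toxic Comment Classification using Bert", which reports, as a percentage, how strongly a text belongs to each label.

Now that I have trained my model, I want to test it on a few unlabeled text excerpts to get the percentage for each label.

I tried this solution: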

def getPrediction(in_sentences):
  label = ['S1, S2, S3']
  input_examples = [run_classifier.InputExample(guid="", text_a = x, text_b = None, label=label) for x in in_sentences]
  input_features = run_classifier.convert_examples_to_features(input_examples, LABEL_COLUMNS, MAX_SEQ_LENGTH, tokenizer)
  predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)
  predictions = estimator.predict(predict_input_fn)
  return [(sentence, prediction['probabilities'], labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]

pred_sentences = [
  "here is an exemple of sentence"]

pred_sentences = ''.join(pred_sentences)

predictions = getPrediction(pred_sentences)

I get:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-490-770bf0871d3e> in <module>
----> 1 predictions = getPrediction(pred_sentences)

<ipython-input-486-3de7328d60db> in getPrediction(in_sentences)
      2   label = ['S1','S2',
      3    'S3']
----> 4   input_examples = [run_classifier.InputExample(guid="", text_a = x, text_b = None, labels=label) for x in in_sentences]
      5   input_features = run_classifier.convert_examples_to_features(input_examples, LABEL_COLUMNS, MAX_SEQ_LENGTH, tokenizer)
      6   predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)

<ipython-input-486-3de7328d60db> in <listcomp>(.0)
      2   label = ['S1,
      3    S2,S3']
----> 4   input_examples = [run_classifier.InputExample(guid="", text_a = x, text_b = None, labels=label) for x in in_sentences]
      5   input_features = run_classifier.convert_examples_to_features(input_examples, LABEL_COLUMNS, MAX_SEQ_LENGTH, tokenizer)
      6   predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)

TypeError: __init__() got an unexpected keyword argument 'labels'

Do you know what I need to change to make the last part of this code work?


1 answer
Anonymous user
#1 · Posted 2024-03-29 09:53:56

You made a typo: InputExample expects a keyword argument named label, not labels.

[run_classifier.InputExample(guid="", text_a = x, text_b = None, labels=label) for x in in_sentences]
                                                                      ^
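
The keyword fix can be reproduced without BERT installed. The sketch below uses a stand-in InputExample class that only mirrors the constructor signature (guid, text_a, text_b=None, label=None). The helper build_examples and the placeholder label value "S1" are assumptions made for illustration; which label value is valid depends on how the features are built afterwards. The sketch also shows a second problem that appears to be present in the question's snippet: ''.join(pred_sentences) turns the list into a single string, so the list comprehension would iterate over characters instead of sentences.

```python
class InputExample:
    """Stand-in for run_classifier.InputExample (same constructor signature only)."""

    def __init__(self, guid, text_a, text_b=None, label=None):
        self.guid = guid
        self.text_a = text_a
        self.text_b = text_b
        self.label = label


def build_examples(sentences, placeholder_label):
    # Hypothetical helper: one example per item of `sentences`.
    # Note the keyword is `label`, not `labels`.
    return [
        InputExample(guid="", text_a=s, text_b=None, label=placeholder_label)
        for s in sentences
    ]


pred_sentences = ["here is an exemple of sentence"]

# 1) The wrong keyword reproduces the TypeError from the traceback.
try:
    InputExample(guid="", text_a=pred_sentences[0], text_b=None, labels="S1")
    error_message = None
except TypeError as err:
    error_message = str(err)

# 2) Passing the list as-is yields one example per sentence.
examples = build_examples(pred_sentences, "S1")

# 3) Joining the list first yields one example per character instead.
joined = "".join(pred_sentences)
per_char_examples = build_examples(joined, "S1")

print(error_message)
print(len(examples), len(per_char_examples))
```

If the aim is one prediction per sentence, keeping pred_sentences as a list and dropping the join line is probably what was intended.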