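I am currently working on a multi-label classification task for text data. I have a DataFrame with an ID column, a text column, and several label columns that contain only 1 or 0.
I used an existing solution from the Kaggle kernel "Toxic Comment Classification using Bert", which reports, as a percentage, how strongly a text belongs to each label.
Now that I have trained my model, I would like to test it on a few unlabeled text extracts to get the percentage for each label.
I tried this solution: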
def getPrediction(in_sentences):
    label = ['S1, S2, S3']
    input_examples = [run_classifier.InputExample(guid="", text_a = x, text_b = None, label=label) for x in in_sentences]
    input_features = run_classifier.convert_examples_to_features(input_examples, LABEL_COLUMNS, MAX_SEQ_LENGTH, tokenizer)
    predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)
    predictions = estimator.predict(predict_input_fn)
    return [(sentence, prediction['probabilities'], labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]

pred_sentences = [
    "here is an exemple of sentence"]
pred_sentences = ''.join(pred_sentences)
predictions = getPrediction(pred_sentences)
I get:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-490-770bf0871d3e> in <module>
----> 1 predictions = getPrediction(pred_sentences)

<ipython-input-486-3de7328d60db> in getPrediction(in_sentences)
      2 label = ['S1','S2',
      3 'S3']
----> 4 input_examples = [run_classifier.InputExample(guid="", text_a = x, text_b = None, labels=label) for x in in_sentences]
      5 input_features = run_classifier.convert_examples_to_features(input_examples, LABEL_COLUMNS, MAX_SEQ_LENGTH, tokenizer)
      6 predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)

<ipython-input-486-3de7328d60db> in <listcomp>(.0)
      2 label = ['S1,
      3 S2,S3']
----> 4 input_examples = [run_classifier.InputExample(guid="", text_a = x, text_b = None, labels=label) for x in in_sentences]
      5 input_features = run_classifier.convert_examples_to_features(input_examples, LABEL_COLUMNS, MAX_SEQ_LENGTH, tokenizer)
      6 predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)

TypeError: __init__() got an unexpected keyword argument 'labels'
Do you know what I need to change to get the last part of the algorithm working?
You have a typo: InputExample expects a keyword argument named label, not labels.
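To illustrate the point, here is a minimal sketch that reproduces the error and the fix without needing BERT installed. The InputExample class below is a hypothetical stand-in, assumed to mirror the constructor the traceback implies (guid, text_a, text_b, label); build_examples and the placeholder label value are also assumptions made for illustration, and which label values the real convert_examples_to_features accepts depends on the version of run_classifier in use.

```python
class InputExample:
    """Hypothetical stand-in: accepts `label` (singular), not `labels`."""

    def __init__(self, guid, text_a, text_b=None, label=None):
        self.guid = guid
        self.text_a = text_a
        self.text_b = text_b
        self.label = label


def build_examples(in_sentences, placeholder_label):
    """Wrap each sentence in an InputExample using the singular `label` keyword."""
    return [
        InputExample(guid="", text_a=sentence, text_b=None, label=placeholder_label)
        for sentence in in_sentences
    ]


# Passing the misspelled keyword raises the same kind of TypeError as in the traceback.
try:
    InputExample(guid="", text_a="some text", text_b=None, labels=[0, 0, 0])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Keep the input as a list of strings: joining it into a single string would
# make the comprehension iterate over individual characters, not sentences.
pred_sentences = ["here is an example of a sentence"]
examples = build_examples(pred_sentences, placeholder_label=[0, 0, 0])

print(error_message)
print(len(examples), repr(examples[0].text_a), examples[0].label)
```

In the code from the question, the equivalent change would be to write label=label in the InputExample call and to pass pred_sentences to getPrediction as a list, without the ''.join step.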