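I want to train a spaCy text classifier using the labels and words in a DataFrame, but I can't get the training data into the right shape and pass it to the training loop.
Example DataFrame: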
       category       word  score
0         anger     fasten    0.0
1         anger     morals    1.0
2         anger  tributary    0.0
3         anger    changer    0.0
4         anger   morality    0.0
...         ...        ...    ...
184125    trust      amber    0.0
184126    trust  pulmonary    0.0
184127    trust    ambient    0.0
184128    trust      amaze    0.0
184129    trust       zoom    0.0
Example code:
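import random

import spacy
from spacy.util import minibatch

TRAIN_DATA = [
    # HERE THE TRAIN DATA FROM THE DATAFRAME
    # anger : words related with anger
    # trust : words related with trust
]

# Uses the spaCy v2 API (create_pipe / begin_training)
nlp = spacy.load("en_core_web_sm")
category = nlp.create_pipe("textcat", config={"exclusive_classes": True})
nlp.add_pipe(category)

# add labels to the text classifier; they must match the keys used in "cats"
category.add_label("anger")
category.add_label("trust")

# train only the text classifier so the pretrained pipes are left untouched
other_pipes = [pipe for pipe in nlp.pipe_names if pipe != "textcat"]
with nlp.disable_pipes(*other_pipes):
    optimizer = nlp.begin_training()
    for i in range(100):
        losses = {}  # reset so each epoch reports its own loss
        random.shuffle(TRAIN_DATA)
        for batch in minibatch(TRAIN_DATA, size=8):
            # make_doc only tokenizes; nlp(text) would run the full pipeline
            texts = [nlp.make_doc(text) for text, entities in batch]
            annotations = [{"cats": entities} for text, entities in batch]
            nlp.update(texts, annotations, sgd=optimizer, losses=losses)
        print(i, losses)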
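One way the TRAIN_DATA placeholder might be filled is sketched below. It is only a sketch under several assumptions: the rows are available as plain dicts (for example via `df.to_dict("records")`), a score of 1.0 means the word belongs to that category, and each item should look like `(word, {label: 0.0 or 1.0})`, which is what the loop's `{"cats": entities}` implies. The name `build_train_data` and the sample rows are illustrative and not part of the original code.

```python
def build_train_data(records, labels):
    """Group (category, word, score) rows into (word, cats) training pairs."""
    cats_by_word = {}
    for row in records:
        # Start every word with all labels switched off
        cats = cats_by_word.setdefault(row["word"], {label: 0.0 for label in labels})
        if row["score"] == 1.0:
            cats[row["category"]] = 1.0
    # Design choice: drop words that are not tagged with any category
    return [(word, cats) for word, cats in cats_by_word.items() if any(cats.values())]


records = [
    {"category": "anger", "word": "fasten", "score": 0.0},
    {"category": "anger", "word": "morals", "score": 1.0},
    {"category": "trust", "word": "amber", "score": 0.0},
    {"category": "trust", "word": "morals", "score": 0.0},  # made-up row for illustration
]
TRAIN_DATA = build_train_data(records, labels=["anger", "trust"])
print(TRAIN_DATA)
```

With "exclusive_classes": True each word is expected to carry exactly one positive label, so words tagged with several categories would probably need separate handling.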
Expected output:
doc = nlp(u'confidence') --> prediction : trust
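After training, spaCy's text classifier stores its scores in `doc.cats`, a dict that maps each label to a float. A minimal sketch of turning such a dict into a single prediction follows; the helper `predict_label` is a hypothetical name, and the scores are invented stand-ins for `nlp(u'confidence').cats`, not real model output.

```python
def predict_label(cats):
    """Return the label with the highest score from a doc.cats-style dict."""
    return max(cats, key=cats.get)


# Invented scores standing in for nlp(u'confidence').cats
example_cats = {"anger": 0.08, "trust": 0.92}
prediction = predict_label(example_cats)
print("prediction :", prediction)
```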
So I used this code to build the training data format, but each training run seems to take about 4 hours.
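A rough count of weight updates may help put the 4 hours in context. The sketch below assumes one training item per DataFrame row (184,130 rows if the index runs from 0 to 184129 without gaps) and takes the batch size of 8 and the 100 iterations from the loop above. It only counts `nlp.update` calls and measures nothing.

```python
import math

n_examples = 184_130  # assumed: one training item per DataFrame row (index 0..184129)
batch_size = 8        # from minibatch(TRAIN_DATA, size=8)
n_epochs = 100        # from range(100)

batches_per_epoch = math.ceil(n_examples / batch_size)
total_updates = batches_per_epoch * n_epochs
print(batches_per_epoch, total_updates)
```

Under these assumptions that is about 23 thousand updates per pass and about 2.3 million in total, so a larger batch size or fewer iterations would reduce the count directly.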