我想对以下关键词进行分类:
import spacy
from spacy.matcher import PhraseMatcher
nlp = spacy.load("en_core_web_sm")
phrase_matcher = PhraseMatcher(nlp.vocab)
cat_patterns = [nlp(text) for text in ('cat', 'cute', 'fat')]
dog_patterns = [nlp(text) for text in ('dog', 'fat')]
matcher = PhraseMatcher(nlp.vocab)
matcher.add('Category1', None, *cat_patterns)
matcher.add('Category2', None, *dog_patterns)
doc = nlp("I have a white cat. It is cute and fat; I have a black dog. It is fat,too")
matches = matcher(doc)
for match_id, start, end in matches:
rule_id = nlp.vocab.strings[match_id] # get the unicode ID, i.e. 'CategoryID'
span = doc[start : end] # get the matched slice of the doc
print(rule_id, span.text)
#Output
#Category1 cat
#Category1 cute
#Category1 fat
#Category2 fat
#Category2 dog
#Category1 fat
#Category2 fat
然而,我的预期输出是,如果文本包含cat和cute或cat和fat在一起,它将属于第一类;如果文本包含dog和fat,那么它将属于第二类
#Category1 cat cute
#Category1 cat fat
#Category2 dog fat
是否可以使用类似的算法来实现?多谢各位
目前没有回答
相关问题 更多 >
编程相关推荐