spaCy Matcher with conditional OR / AND in Python

Posted 2024-04-29 20:09:30


I want to classify text based on the following keywords:

import spacy
from spacy.matcher import PhraseMatcher

nlp = spacy.load("en_core_web_sm")

cat_patterns = [nlp(text) for text in ('cat', 'cute', 'fat')]
dog_patterns = [nlp(text) for text in ('dog', 'fat')]

matcher = PhraseMatcher(nlp.vocab)
# add(key, docs) is the documented signature in spaCy v3;
# the older add(key, None, *docs) form is deprecated.
matcher.add('Category1', cat_patterns)
matcher.add('Category2', dog_patterns)

doc = nlp("I have a white cat. It is cute and fat; I have a black dog. It is fat,too")
matches = matcher(doc)
for match_id, start, end in matches:
    rule_id = nlp.vocab.strings[match_id]  # look up the rule name from its hash, e.g. 'Category1'
    span = doc[start : end]  # get the matched slice of the doc
    print(rule_id, span.text)

#Output
#Category1 cat
#Category1 cute
#Category1 fat
#Category2 fat
#Category2 dog
#Category1 fat
#Category2 fat

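However, this is not what I expect. If the text contains both cat and cute, or both cat and fat, it should belong to Category1; if the text contains both dog and fat, it should belong to Category2: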

#Category1 cat cute
#Category1 cat fat
#Category2 dog fat
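
To make the intended rule concrete, here is a minimal sketch of the combination logic in plain Python. It assumes the keywords have already been found (for example by the PhraseMatcher loop above) and only decides which category fires. The `RULES` table and the `classify` function are hypothetical names made up for this illustration, not part of spaCy, and the rule is applied to the whole text rather than per sentence.

```python
# Hypothetical rule table: each category needs one anchor keyword (AND)
# together with at least one of its companion keywords (OR).
RULES = {
    "Category1": ("cat", ("cute", "fat")),
    "Category2": ("dog", ("fat",)),
}


def classify(found_keywords):
    """Return (category, anchor, companion) for every satisfied pairing."""
    found = set(found_keywords)  # duplicates and order do not matter
    results = []
    for category, (anchor, companions) in RULES.items():
        if anchor not in found:
            continue  # the AND part fails: anchor keyword is missing
        for companion in companions:
            if companion in found:  # the OR part: any companion will do
                results.append((category, anchor, companion))
    return results


# Keyword texts as the matcher loop above reports them for the sample doc
matched = ["cat", "cute", "fat", "fat", "dog", "fat", "fat"]
for category, anchor, companion in classify(matched):
    print(category, anchor, companion)
```

In practice `matched` would be collected as `span.text` inside the match loop instead of being written out by hand. Whether the keywords must occur in the same sentence is not covered here; that would need the matches grouped by sentence first.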

Is it possible to achieve this with a similar approach? Thanks, everyone.

