spaCy: casting a LIKE_NUM token to its Python number equivalent

Posted 2024-03-29 06:29:54

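Does spaCy provide a quick way to convert a LIKE_NUM token into a Python float or Decimal? spaCy can match LIKE_NUM tokens such as "31,2", "10.9", "10", and so on. Does it also offer a quick way to get the corresponding Python number? I expected a method like .get_value() that returns the number (not a string), but I could not find one.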

import spacy
from spacy.matcher import Matcher

nlp = spacy.load('en_core_web_sm')
matcher = Matcher(nlp.vocab)
text = "this is just a text and a number 10,2 or 10.2 meaning ten point two"
doc = nlp(text)

pattern = [{"LIKE_NUM": True}]

# spaCy v3 signature; in spaCy v2 this was matcher.add("number_match", None, pattern)
matcher.add("number_match", [pattern])

matches = matcher(doc)
print("All matches:")
for match_id, start, end in matches:
    string_id = nlp.vocab.strings[match_id]  # Get string representation
    span = doc[start:end]  # The matched span
    print(match_id, string_id, start, end, span.text)

    print(type(span.text))

Output:

All matches:
13316671205374851783 number_match 8 9 10,2
<class 'str'>
13316671205374851783 number_match 10 11 10.2
<class 'str'>
13316671205374851783 number_match 12 13 ten
<class 'str'>
13316671205374851783 number_match 14 15 two
<class 'str'>
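
As far as I can tell, LIKE_NUM only flags a token as number-like and does not hand back a parsed value, so the conversion has to be done on the token text. The following is a minimal sketch rather than a spaCy feature: `like_num_to_float` and `NUMBER_WORDS` are hypothetical names introduced here, the word table is deliberately tiny, and a lone comma is assumed to be a decimal separator (so "1,000" would come out as 1.0, which may not be what you want).

```python
from typing import Optional

# Hypothetical lookup table (not part of spaCy); extend it as needed.
NUMBER_WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9,
    "ten": 10, "eleven": 11, "twelve": 12,
}


def like_num_to_float(text: str) -> Optional[float]:
    """Best-effort conversion of a number-like token text to a float.

    Returns None when the text cannot be interpreted.
    """
    cleaned = text.strip().lower()

    # Spelled-out numbers such as "ten" or "two".
    if cleaned in NUMBER_WORDS:
        return float(NUMBER_WORDS[cleaned])

    # Assumption: a single comma with no dot is a decimal separator
    # ("10,2" -> "10.2"); any other commas are thousands separators.
    if cleaned.count(",") == 1 and "." not in cleaned:
        cleaned = cleaned.replace(",", ".")
    else:
        cleaned = cleaned.replace(",", "")

    try:
        return float(cleaned)
    except ValueError:
        # Unsupported forms (for example fractions like "1/2") end up here.
        return None
```

Inside the loop above, `like_num_to_float(span.text)` would give 10.2, 10.2, 10.0 and 2.0 for the four matches. If an attribute-style accessor is preferred, the helper could be registered as a custom extension with `Token.set_extension("num_value", getter=lambda t: like_num_to_float(t.text))` and then read as `token._.num_value`.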
