Extracting person names with named entity recognition (NER) in NLP using Python

(S (PERSON Larry/NNP) (ORGANIZATION Page/NNP) is/VBZ an/DT (GPE American/JJ) business/NN magnate/NN and/CC computer/NN scientist/NN who/WP is/VBZ the/DT co-founder/NN of/IN (GPE Google/NNP) ,/, alongside/RB (PERSON Sergey/NNP Brin/NNP))

2 Answers

User

Answer 1 · edited 2024-05-29 06:39:59

In long

Please read the following carefully:

Understand the solution; don't just copy and paste.

TL;DR

In the terminal:

pip install -U nltk

wget http://nlp.stanford.edu/software/stanford-corenlp-full-2016-10-31.zip
unzip stanford-corenlp-full-2016-10-31.zip && cd stanford-corenlp-full-2016-10-31

java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer \
-preload tokenize,ssplit,pos,lemma,parse,depparse \
-status_port 9000 -port 9000 -timeout 15000

In Python:

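# Note: CoreNLPNERTagger ships with older NLTK releases. Newer releases expose the
# same tagger as nltk.parse.corenlp.CoreNLPParser(url=..., tagtype='ner').
from nltk.tag.stanford import CoreNLPNERTagger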

def get_continuous_chunks(tagged_sent):
    continuous_chunk = []
    current_chunk = []

    for token, tag in tagged_sent:
        if tag != "O":
            current_chunk.append((token, tag))
        else:
            if current_chunk: # if the current chunk is not empty
                continuous_chunk.append(current_chunk)
                current_chunk = []
    # Flush the final current_chunk into the continuous_chunk, if any.
    if current_chunk:
        continuous_chunk.append(current_chunk)
    return continuous_chunk


stner = CoreNLPNERTagger()
tagged_sent = stner.tag('Rami Eid is studying at Stony Brook University in NY'.split())

named_entities = get_continuous_chunks(tagged_sent)
named_entities_str_tag = [(" ".join([token for token, tag in ne]), ne[0][1]) for ne in named_entities]


print(named_entities_str_tag)

[out]:

[('Rami Eid', 'PERSON'), ('Stony Brook University', 'ORGANIZATION'), ('NY', 'LOCATION')]
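
Note that get_continuous_chunks only closes a chunk when it meets an "O" tag, so two entities of different types that sit next to each other would be merged into one chunk labeled with the first tag. Below is a minimal sketch that splits on every tag change instead and then keeps only the PERSON entities. It assumes the same list of (token, tag) pairs that the tagger returns; the helper names group_by_tag and person_names are made up for illustration.

```python
from itertools import groupby


def group_by_tag(tagged_sent):
    """Join consecutive tokens that share the same non-"O" tag into (text, tag) pairs."""
    entities = []
    for tag, group in groupby(tagged_sent, key=lambda pair: pair[1]):
        if tag != "O":
            entities.append((" ".join(token for token, _ in group), tag))
    return entities


def person_names(tagged_sent):
    """Keep only the entities tagged PERSON."""
    return [text for text, tag in group_by_tag(tagged_sent) if tag == "PERSON"]


# Hand-written sample in the tagger's output format, so no server is needed here.
sample = [("Rami", "PERSON"), ("Eid", "PERSON"), ("is", "O"), ("studying", "O"),
          ("at", "O"), ("Stony", "ORGANIZATION"), ("Brook", "ORGANIZATION"),
          ("University", "ORGANIZATION"), ("in", "O"), ("NY", "LOCATION")]

print(person_names(sample))  # ['Rami Eid']
```

Two different people named back to back with nothing between them would still be joined into one string, since the tags alone carry no boundary information.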

You may also find this helpful: Unpacking a list / tuple of pairs into two lists / tuples
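
As a quick illustration of that unpacking idiom, here is a sketch that applies it to the (entity, tag) pairs printed above. The variable names entities and tags are chosen here for the example.

```python
named_entities_str_tag = [("Rami Eid", "PERSON"),
                          ("Stony Brook University", "ORGANIZATION"),
                          ("NY", "LOCATION")]

# zip(*pairs) transposes the list of pairs into one tuple per position.
entities, tags = zip(*named_entities_str_tag)

print(entities)  # ('Rami Eid', 'Stony Brook University', 'NY')
print(tags)      # ('PERSON', 'ORGANIZATION', 'LOCATION')
```

This unpacking raises a ValueError when the list is empty, so it is worth checking for that case if a sentence may contain no entities.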

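User

Answer 2 · edited 2024-05-29 06:39:59

First, you need to download the jar file and the other required files. Follow this link: https://gist.github.com/troyane/c9355a3103ea08679baf. Run the code there to download the files (everything except the last few lines). Once the download is done, you can move on to the extraction step.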

from nltk.tag.stanford import StanfordNERTagger
st = StanfordNERTagger('/home/saheli/Downloads/my_project/stanford-ner/english.all.3class.distsim.crf.ser.gz',
                   '/home/saheli/Downloads/my_project/stanford-ner/stanford-ner.jar')
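
With the tagger st constructed, person names could be pulled out roughly as follows. This is only a sketch: extract_person_names is a hypothetical helper, not part of NLTK. It assumes tagger.tag(tokens) returns (token, tag) pairs, as StanfordNERTagger.tag does, and it tokenizes with a plain whitespace split, which leaves punctuation attached to words.

```python
def extract_person_names(tagger, text):
    """Return each run of consecutive PERSON-tagged tokens as one joined string."""
    names, current = [], []
    for token, tag in tagger.tag(text.split()):
        if tag == "PERSON":
            current.append(token)
        elif current:
            # A non-PERSON token ends the name collected so far.
            names.append(" ".join(current))
            current = []
    if current:
        # Flush a name that runs to the end of the sentence.
        names.append(" ".join(current))
    return names


# Usage with the tagger built above (needs Java and the downloaded model files):
# print(extract_person_names(st, "Rami Eid is studying at Stony Brook University in NY"))
```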
