Keyword-based summarization

Posted 2024-05-13 22:27:35


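I would like to know whether there are automatic summarization algorithms that support extraction driven by a custom dictionary. I have been using a TextRank-based algorithm for a while, but I want to influence the sentence ranking that the algorithm computes.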

Example

"Thomas A. Anderson is a man living two lives. By day he is an average computer programmer and by night a hacker known as Neo. Neo has always questioned his reality, but the truth is far beyond his imagination. Neo finds himself targeted by the police when he is contacted by Morpheus, a legendary computer hacker branded a terrorist by the government. Morpheus awakens Neo to the real world, a ravaged wasteland where most of humanity have been captured by a race of machines that live off of the humans' body heat and electrochemical energy and who imprison their minds within an artificial reality known as the Matrix. As a rebel against the machines, Neo must return to the Matrix. He must confront the agents: super-powerful computer programs devoted to snuffing out Neo and the entire human rebellion."

My custom dictionary looks like this:

super-powerful: [important]
Thomas A. Anderson: [important]

My summary should contain the following sentences, even if they rank lower than other sentences in the paragraph:

  1. "Thomas A. Anderson is a man living two lives"
  2. "He must confront the agents: super-powerful computer programs devoted to snuffing out Neo and the entire human rebellion."
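To make the desired behaviour concrete, here is a minimal sketch of what "include these sentences regardless of rank" could look like as a post-processing step. It assumes the sentences are already split and that some ranker has produced one score per sentence; the names `contains_term` and `summarize_with_pins` are hypothetical, and matching is a plain case-insensitive substring test.

```python
def contains_term(sentence, terms):
    """Return True if any dictionary term occurs in the sentence (case-insensitive)."""
    lowered = sentence.lower()
    return any(term.lower() in lowered for term in terms)


def summarize_with_pins(sentences, scores, terms, k):
    """Pick up to k sentences, always keeping those that contain a dictionary term.

    sentences: list of sentence strings, in document order
    scores:    one score per sentence from any ranker (e.g. TextRank)
    terms:     dictionary terms marked as important
    k:         target summary length; exceeded only if more than k sentences are pinned
    """
    # Sentences that mention an important term are pinned unconditionally.
    pinned = [i for i, s in enumerate(sentences) if contains_term(s, terms)]
    # The remaining sentences compete on their score, best first.
    rest = sorted(
        (i for i in range(len(sentences)) if i not in pinned),
        key=lambda i: scores[i],
        reverse=True,
    )
    chosen = (pinned + rest)[: max(k, len(pinned))]
    # Emit the summary in original document order.
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize_with_pins(sentences, scores, ["super-powerful", "Thomas A. Anderson"], 3)` would keep both dictionary sentences and fill the remaining slot with the best-scoring other sentence. This leaves the ranking itself untouched, which may or may not be what is wanted.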

I tried to achieve this by adding extra tags to the POS-tagged sentences, like this:

^{pr2}$

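But I really don't know how to tell the TextRank algorithm to give priority to sentences that carry these tags. I am using Python with nltk and yaml to produce this output.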
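One way this could be approached, sketched here purely as an illustration, is to bias the random-jump (teleport) distribution of the PageRank step toward sentences that contain a dictionary term, i.e. a personalized PageRank. The sketch below is plain Python with no nltk or yaml dependency; the function names, the `boost` parameter, and the TextRank-style word-overlap similarity are all assumptions, not the asker's code.

```python
import math
import re


def tokenize(sentence):
    """Lowercased set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def similarity(a, b):
    """TextRank-style overlap: shared words divided by the sum of log lengths."""
    ta, tb = tokenize(a), tokenize(b)
    if len(ta) < 2 or len(tb) < 2:
        return 0.0  # avoids log(1) + log(1) == 0 in the denominator
    return len(ta & tb) / (math.log(len(ta)) + math.log(len(tb)))


def biased_textrank(sentences, terms, boost=5.0, damping=0.85, iterations=50):
    """Score sentences with PageRank whose teleport step favours dictionary sentences.

    boost is a hypothetical knob: the relative teleport weight of a sentence
    that contains any term (all other sentences get weight 1.0).
    """
    n = len(sentences)
    if n == 0:
        return []
    # Weighted, undirected sentence graph without self-loops.
    w = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out = [sum(row) for row in w]
    # Teleport distribution: heavier mass on sentences mentioning a term.
    bias = [boost if any(t.lower() in s.lower() for t in terms) else 1.0
            for s in sentences]
    total = sum(bias)
    bias = [b / total for b in bias]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Score flowing into i from every neighbour j that has outgoing weight.
            incoming = sum(scores[j] * w[j][i] / out[j]
                           for j in range(n) if out[j] > 0)
            new_scores.append((1 - damping) * bias[i] + damping * incoming)
        scores = new_scores
    return scores
```

With `boost=1.0` this reduces to an unbiased ranking over the same graph, so the effect of the dictionary can be tuned gradually. A boost only raises the chance that a sentence is selected; it does not guarantee inclusion, so a hard requirement would still need an explicit pinning step.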

Any help would be greatly appreciated!

