提取位移(spacy)输出相关连接

2024-04-25 23:49:52 发布

您现在位置:Python中文网/ 问答频道 /正文

我正在使用spacy的displace可视化工具来查看句子中单词之间的依赖关系。看起来是这样的:

text = 'European authorities fined Google a record $5.1 billion on Wednesday for abusing its power in the mobile phone market and ordered the company to alter its practices print(displacy.render(nlp(text), jupyter=True, style='ent')) print(displacy.render(nlp(text), style='dep', jupyter = True, options = {'distance': 120}))

enter image description here

是否仍然可以通过索引字符串中的单词来提取箭头所建立的连接?例如,在下图中,查看“欧洲当局罚款谷歌”中的连接。是否仍要生成以下数据帧(word列中的每个单词,以及word连接到的每个单词在connection列中)?:

word       |   connection
---------------------------
European   |   
Authorities| European
fined      | Authorities, Google, record, ..., ...
Google     | 

Tags: thetexttruenlpgooglejupyterrenderrecord
1条回答
网友
1楼 · 发布于 2024-04-25 23:49:52

Spacyprovides有许多属性可用于此目的,如ancestorschildren。请注意,这些属性返回生成器,因此需要将它们转换为列表,然后转换为字符串

下面是一个示例,我使用了children属性

text = 'European authorities fined Google a record $5.1 billion on Wednesday for abusing its power in the mobile phone market and ordered the company to alter its practices'
doc = nlp(text)
words = []
a_network = []
for w in doc:
  words.append(w)
  network = [t.text for t in list(w.children)]
  a_network.append(", ".join(network))

df = pd.DataFrame({"word":words,"network":a_network})

print(df)

输出将是

           word                               network
0      European                                      
1   authorities                              European
2         fined  authorities, Google, record, on, for
3        Google                                      
4             a                                      
5        record                            a, billion
6             $                                      
7           5.1                                      
8       billion                                $, 5.1
...

相关问题 更多 >

    热门问题