Saving an NLTK-drawn parse tree to an image file

19 votes
4 answers
14417 views
Asked 2025-04-18 05:07


Is there any way to save the image drawn by tree.draw() to an image file? I looked through the documentation but could not find anything about it.

4 Answers

1

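If you want to save an NLTK tree to an image file (regardless of your operating system), I recommend the Constituent-Treelib library. It is built on benepar, spaCy, and NLTK. First, install it with pip install constituent-treelib.

Then follow these steps: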

from nltk import Tree
from constituent_treelib import ConstituentTree

# Define your sentence that should be parsed and saved to a file
sentence = "At least nine tenths of the students passed."

# Rather than a raw string you can also provide an already constructed NLTK tree
sentence = Tree('S', [Tree('NP', [Tree('NP', [Tree('QP', [Tree('ADVP', [Tree('RB', ['At']), Tree('RBS', ['least'])]), Tree('CD', ['nine'])]), Tree('NNS', ['tenths'])]), Tree('PP', [Tree('IN', ['of']), Tree('NP', [Tree('DT', ['the']), Tree('NNS', ['students'])])])]), Tree('VP', [Tree('VBD', ['passed'])]), Tree('.', ['.'])])

# Define the language that should be considered with respect to the underlying benepar and spaCy models 
language = ConstituentTree.Language.English

# You can also specify the desired model for the language ("Small" is selected by default)
spacy_model_size = ConstituentTree.SpacyModelSize.Large

# Create the necessary NLP pipeline (required to instantiate a ConstituentTree object)
nlp = ConstituentTree.create_pipeline(language, spacy_model_size) 

# In case you haven't downloaded the required benepar and spaCy models, you can tell the method to do it automatically for you
# nlp = ConstituentTree.create_pipeline(language, spacy_model_size, download_models=True) 

# Instantiate a ConstituentTree object and pass it the sentence as well as the NLP pipeline
tree = ConstituentTree(sentence, nlp)

# Now you can export the tree to a file (e.g., a PDF)  
tree.export_tree("NLTK_parse_tree.pdf", verbose=True)

>>> PDF-file successfully saved to: NLTK_parse_tree.pdf

The result: [image]

7

To add to Minjoon's answer, you can change the fonts and colors of the tree so that it looks more like NLTK's .draw() version, as follows:

tc['node_font'] = 'arial 14 bold'
tc['leaf_font'] = 'arial 14'
tc['node_color'] = '#005990'
tc['leaf_color'] = '#3F8F57'
tc['line_color'] = '#175252'
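For context, here is a minimal sketch of how these settings might fit into the CanvasFrame approach from Minjoon's answer, where tc is the TreeWidget. The helper names apply_style and save_styled_tree are hypothetical and are not part of NLTK; the NLTK imports are placed inside the function so that the style helper can be used without a display.

```python
# Style options copied from the answer above.
NLTK_DRAW_STYLE = {
    "node_font": "arial 14 bold",
    "leaf_font": "arial 14",
    "node_color": "#005990",
    "leaf_color": "#3F8F57",
    "line_color": "#175252",
}


def apply_style(widget, style=None):
    """Hypothetical helper: set each option via item assignment (tc[key] = value)."""
    for option, value in (style or NLTK_DRAW_STYLE).items():
        widget[option] = value
    return widget


def save_styled_tree(bracketed, ps_path):
    """Hypothetical helper: draw a styled tree and write it as PostScript.

    Assumes a Tk display is available, as in the CanvasFrame answer.
    """
    from nltk import Tree
    from nltk.draw import TreeWidget
    from nltk.draw.util import CanvasFrame

    cf = CanvasFrame()
    try:
        tc = TreeWidget(cf.canvas(), Tree.fromstring(bracketed))
        apply_style(tc)
        cf.add_widget(tc, 10, 10)  # (10, 10) offsets
        cf.print_to_file(ps_path)
    finally:
        cf.destroy()  # close the window even if saving fails
```

Calling save_styled_tree('(S (NP this tree) (VP (V is) (AdjP pretty)))', 'tree.ps') should then produce a PostScript file that can be converted to PNG in the same way as in the other answers.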

Before is on the left, after is on the right:

before after

17

Use the nltk.draw.tree.TreeView object to create the canvas frame automatically:

>>> from nltk.tree import Tree
>>> from nltk.draw.tree import TreeView
>>> t = Tree.fromstring('(S (NP this tree) (VP (V is) (AdjP pretty)))')
>>> TreeView(t)._cframe.print_to_file('output.ps')

Then:

>>> import os
>>> os.system('convert output.ps output.png')

[output.png]:


13

I had exactly the same need, and after looking at the source code of nltk.draw.tree I found a solution:

from nltk import Tree
from nltk.draw.util import CanvasFrame
from nltk.draw import TreeWidget

cf = CanvasFrame()
t = Tree.fromstring('(S (NP this tree) (VP (V is) (AdjP pretty)))')
tc = TreeWidget(cf.canvas(),t)
cf.add_widget(tc,10,10) # (10,10) offsets
cf.print_to_file('tree.ps')
cf.destroy()

The output file is a PostScript file, which you can convert to an image file using ImageMagick in a terminal:

$ convert tree.ps tree.png

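I think this is a quick and dirty approach; it may be inefficient because it displays the canvas first and then destroys it (there may be an option to turn off the display, but I could not find one). Please let me know if there is a better way.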
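The convert step used in the answers above can also be run from Python with subprocess instead of os.system, so that a failed conversion raises an error. This is only a sketch: the function names are hypothetical, the -density value of 300 is an arbitrary choice for a sharper PNG, and it assumes ImageMagick (plus Ghostscript, which ImageMagick relies on to read PostScript) is installed. It looks for the magick command first, since ImageMagick 7 uses that name and convert refers to an unrelated system tool on Windows.

```python
import shutil
import subprocess


def build_convert_command(ps_path, png_path, density=300, executable="convert"):
    """Return the ImageMagick argument list for a PostScript-to-PNG conversion."""
    return [executable, "-density", str(density), ps_path, png_path]


def find_imagemagick():
    """Return the path of the first ImageMagick command found on PATH, or None."""
    for name in ("magick", "convert"):
        path = shutil.which(name)
        if path:
            return path
    return None


def ps_to_png(ps_path, png_path, density=300):
    """Convert a PostScript file to PNG, raising if ImageMagick is missing or fails."""
    executable = find_imagemagick()
    if executable is None:
        raise FileNotFoundError("ImageMagick was not found on PATH")
    command = build_convert_command(ps_path, png_path, density, executable)
    subprocess.run(command, check=True)
```

For example, ps_to_png('tree.ps', 'tree.png') would replace the manual convert call once tree.ps has been written.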
