MaltParser gives an assertion error when used with NLTK

Published 2024-04-29 19:50:10


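I am using MaltParser through NLTK in Python. I have downloaded the training data and updated NLTK to the latest version. When I call MaltParser, it raises an AssertionError. Below is the Python code, followed by the traceback.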

 mp = MaltParser("C:/Users/mustufain/Desktop/Python Files/maltparser-1.8.1","C:/Users/mustufain/Desktop/Python Files/maltparser-1.7.2",additional_java_args=['-Xmx512m'])

Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    mp = MaltParser("C:/Users/mustufain/Desktop/Python Files/maltparser-1.8.1","C:/Users/mustufain/Desktop/Python Files/maltparser-1.7.2",additional_java_args=['-Xmx512m'])
  File "C:\Python34\lib\site-packages\nltk\parse\malt.py", line 131, in __init__
    self.malt_jars = find_maltparser(parser_dirname)
  File "C:\Python34\lib\site-packages\nltk\parse\malt.py", line 72, in find_maltparser
    assert malt_dependencies.issubset(_jars)
AssertionError
>>> 

2 answers

TL;DR (in Python 3!!):

import urllib.request
import zipfile

# Download the pre-trained English model and the MaltParser distribution.
urllib.request.urlretrieve('http://www.maltparser.org/mco/english_parser/engmalt.poly-1.7.mco', 'C:\\Users\\mustufain\\Desktop\\engmalt.poly-1.7.mco')
urllib.request.urlretrieve('http://maltparser.org/dist/maltparser-1.8.1.zip', 'C:\\Users\\mustufain\\Desktop\\maltparser-1.8.1.zip')
# Unpack the distribution next to the model file.
zfile = zipfile.ZipFile('C:\\Users\\mustufain\\Desktop\\maltparser-1.8.1.zip')
zfile.extractall('C:\\Users\\mustufain\\Desktop\\maltparser-1.8.1\\')

Then pass the extracted MaltParser directory and the downloaded .mco model file to NLTK's MaltParser.
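The step after the downloads is to hand both paths to NLTK. Here is a minimal sketch of that step, not the answer's original code. It assumes the files sit where the download snippet put them, and that `MaltParser` takes the parser directory first and the model file second, as in the question's call. `malt_paths` and `load_parser` are hypothetical helper names.

```python
import os


def malt_paths(base_dir):
    """Return (parser_dir, model_file) under base_dir.

    The two names match the files fetched by the download snippet;
    adjust them if you saved the files elsewhere.
    """
    parser_dir = os.path.join(base_dir, "maltparser-1.8.1")
    model_file = os.path.join(base_dir, "engmalt.poly-1.7.mco")
    return parser_dir, model_file


def load_parser(base_dir):
    """Build an NLTK MaltParser from the downloaded files.

    Needs nltk and a Java runtime; fails early if a path is missing.
    """
    parser_dir, model_file = malt_paths(base_dir)
    for path in (parser_dir, model_file):
        if not os.path.exists(path):
            raise FileNotFoundError(path)
    # Imported lazily so the path check works even without nltk installed.
    from nltk.parse.malt import MaltParser
    return MaltParser(parser_dir, model_file)
```

With nltk and Java available, something like `load_parser('C:\\Users\\mustufain\\Desktop')` should return a parser. Its `parse_one(...)` method takes a list of tokens, for example `'I shot an elephant in my pajamas .'.split()`.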

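If all the downloads and environment variables are set up correctly, the most likely cause is the way file and directory paths are split in nltk/parse/malt.py at https://github.com/nltk/nltk/blob/develop/nltk/parse/malt.py#L69, which separates the directory from the file name in a Linux-specific way: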

def find_maltparser(parser_dirname):
    """
    A module to find MaltParser .jar file and its dependencies.
    """
    if os.path.exists(parser_dirname): # If a full path is given.
        _malt_dir = parser_dirname
    else: # Try to find path to maltparser directory in environment variables.
        _malt_dir = find_dir(parser_dirname, env_vars=('MALT_PARSER',))
    # Checks that that the found directory contains all the necessary .jar
    malt_dependencies = ['','','']
    _malt_jars = set(find_jars_within_path(_malt_dir))
    _jars = set(jar.rpartition('/')[2] for jar in _malt_jars)
    malt_dependencies = set(['log4j.jar', 'libsvm.jar', 'liblinear-1.8.jar'])

    assert malt_dependencies.issubset(_jars)
    assert any(filter(lambda i: i.startswith('maltparser-') and i.endswith('.jar'), _jars))
    return list(_malt_jars)
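Before touching NLTK, it can help to confirm that the three dependency jars are actually on disk, since a missing jar would trip the same assertion. Here is a minimal sketch that uses only the jar names from the `malt_dependencies` set; `missing_malt_jars` is a hypothetical helper, not part of NLTK.

```python
import os

# Jar names copied from the malt_dependencies set in find_maltparser.
REQUIRED_JARS = {"log4j.jar", "libsvm.jar", "liblinear-1.8.jar"}


def missing_malt_jars(parser_dir):
    """Return the required jar names not found anywhere under parser_dir."""
    found = set()
    for _root, _dirs, files in os.walk(parser_dir):
        # Compare bare file names, so the path separator does not matter.
        found.update(name for name in files if name.endswith(".jar"))
    return REQUIRED_JARS - found
```

If this returns an empty set for your MaltParser directory and the assertion still fails, the jars are present and the path splitting is the more likely culprit.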

The bug has been fixed, and the fix is being merged at https://github.com/nltk/nltk/pull/1292

Change this line:

_jars = set(jar.rpartition('/')[2] for jar in _malt_jars)

to:

_jars = set(os.path.split(jar)[1] for jar in _malt_jars)

This should solve your problem =)
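To see why splitting on '/' breaks on Windows, here is a small sketch comparing the two expressions. It uses `ntpath`, the Windows implementation of `os.path`, so it behaves the same on any OS. The jar path is a made-up example of the mixed-separator form you would likely get when the directory is passed with forward slashes and file names are joined with backslashes.

```python
import ntpath  # Windows flavour of os.path, importable on any platform

# Hypothetical jar path with mixed separators, as might appear on Windows.
jar = "C:/Users/me/maltparser-1.8.1\\lib\\log4j.jar"

by_slash = jar.rpartition('/')[2]  # expression used before the fix
by_split = ntpath.split(jar)[1]    # what os.path.split(jar)[1] does on Windows

print(by_slash)  # maltparser-1.8.1\lib\log4j.jar
print(by_split)  # log4j.jar

# Only the second form matches the bare jar names the assertion expects.
print({"log4j.jar"}.issubset({by_slash}))  # False
print({"log4j.jar"}.issubset({by_split}))  # True
```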

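The answer has nothing to do with the code itself. It depends on how the environment variables are set, or on how the MaltParser directory and files were downloaded and saved. See https://github.com/nltk/nltk/issues/1294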
