Unable to download the pipelines provided by the sparknlp library

import sparknlp
from sparknlp.pretrained import PretrainedPipeline

# create or get Spark Session
spark = sparknlp.start()

sparknlp.version()
spark.version

# download, load, and annotate a text by pre-trained pipeline
pipeline = PretrainedPipeline('recognize_entities_dl', lang='en')
result = pipeline.annotate('Harry Potter is a great movie')

2.1.0
recognize_entities_dl download started this may take some time.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-13-b71a0f77e93a> in <module>
     11 #download, load, and annotate a text by pre-trained pipeline
     12
---> 13 pipeline = PretrainedPipeline('recognize_entities_dl', 'en')
     14 result = pipeline.annotate('Harry Potter is a great movie')

d:\python36\lib\site-packages\sparknlp\pretrained.py in __init__(self, name, lang, remote_loc)
     89
     90     def __init__(self, name, lang='en', remote_loc=None):
---> 91         self.model = ResourceDownloader().downloadPipeline(name, lang, remote_loc)
     92         self.light_model = LightPipeline(self.model)
     93

d:\python36\lib\site-packages\sparknlp\pretrained.py in downloadPipeline(name, language, remote_loc)
     50     def downloadPipeline(name, language, remote_loc=None):
     51         print(name + " download started this may take some time.")
---> 52         file_size = _internal._GetResourceSize(name, language, remote_loc).apply()
     53         if file_size == "-1":
     54             print("Can not find the model to download please check the name!")

AttributeError: module 'sparknlp.internal' has no attribute '_GetResourceSize'

1 Answer

User

#1 · Posted on 2024-04-26 01:20:56

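Thanks for confirming your Apache Spark version. Pretrained pipelines and models are tied to specific Apache Spark and Spark NLP versions. You need Apache Spark 2.4.x or later to download pretrained models/pipelines. On any earlier version, you have to train your own models/pipelines.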
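The version requirement above can be checked before attempting a download. The following is a minimal sketch, not part of Spark NLP: the helper names `spark_major_minor` and `supports_pretrained` are hypothetical, and the only fact taken from the answer is the 2.4.x minimum.

```python
import re

# Minimum Apache Spark version named in the answer (2.4.x).
MIN_SPARK = (2, 4)


def spark_major_minor(version):
    """Extract (major, minor) from a version string such as '2.4.3'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def supports_pretrained(version, minimum=MIN_SPARK):
    """Return True if the given Spark version meets the minimum."""
    # Tuples compare element by element, so (3, 0) >= (2, 4) holds.
    return spark_major_minor(version) >= minimum
```

In the question's session, this would be called as `supports_pretrained(spark.version)`, where `spark` is the session returned by `sparknlp.start()`. If it returns False, the download is expected to fail for the reason given above.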

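Here is the list of all pipelines, all of which are for Apache Spark 2.4.x: https://nlp.johnsnowlabs.com/docs/en/pipelines

If you look at the URL of any model or pipeline, you can read the following information from the file name: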

recognize_entities_dl_en_2.1.0_2.4_1562946909722.zip

Name: recognize_entities_dl
Lang: en
Spark NLP: must be 2.1.0 or greater
Apache Spark: must be 2.4.x or greater
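The file-name breakdown above can be expressed as a small parser. This is only a sketch that assumes every resource file follows the same `name_lang_sparknlp_spark_number.zip` layout as the one example shown. The function name `parse_resource_filename` and the field name `trailing_id` are hypothetical, and the answer does not say what the final number means.

```python
import re
from collections import namedtuple

ResourceInfo = namedtuple(
    "ResourceInfo", ["name", "lang", "spark_nlp", "spark", "trailing_id"]
)

# Layout assumed from the single example in the answer:
#   <name>_<lang>_<Spark NLP version>_<Apache Spark version>_<number>.zip
_PATTERN = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<spark_nlp>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<trailing_id>\d+)\.zip$"
)


def parse_resource_filename(filename):
    """Split a pretrained resource file name into its components."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError("unexpected file name layout: %r" % filename)
    return ResourceInfo(**match.groupdict())


info = parse_resource_filename(
    "recognize_entities_dl_en_2.1.0_2.4_1562946909722.zip"
)
print(info.name, info.lang, info.spark_nlp, info.spark)
```

The `spark_nlp` and `spark` fields are the minimum versions the resource was built for, so they can be compared against the installed versions before downloading.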

NOTE: The Spark NLP library is built and compiled against Apache Spark 2.4.x. That is why the models and pipelines are only available for 2.4.x.

NOTE 2: Since you are on Windows, you need to use the _noncontrib models and pipelines, which are compatible with Windows: Do Spark-NLP pretrained pipelines only work on linux systems?
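One way to apply the Windows note is to pick the pipeline name by operating system. The sketch below assumes the Windows-compatible variant is named by appending `_noncontrib` to the base name, which is inferred from the wording of the note and not confirmed here. Whether such a variant exists for a given pipeline should be checked against the pipeline list. The helper `pipeline_name_for_os` is hypothetical.

```python
import platform

# Assumed suffix, taken from the wording of the note above.
NONCONTRIB_SUFFIX = "_noncontrib"


def pipeline_name_for_os(base_name, system=None):
    """Return the pipeline name to request on the current OS.

    `system` defaults to platform.system() and can be overridden
    to see what would be chosen on another platform.
    """
    if system is None:
        system = platform.system()
    if system == "Windows" and not base_name.endswith(NONCONTRIB_SUFFIX):
        return base_name + NONCONTRIB_SUFFIX
    return base_name


print(pipeline_name_for_os("recognize_entities_dl", system="Windows"))
print(pipeline_name_for_os("recognize_entities_dl", system="Linux"))
```

The returned name would then be passed to `PretrainedPipeline(...)` in place of the hard-coded `'recognize_entities_dl'` from the question.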

I hope this answer helps you solve your issue.
