Converting PDF pages to images with pdf2image in Python on AWS Lambda

Posted 2024-04-26 19:11:14


Lambda handler code:

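from pdf2image import convert_from_path


def lambda_handler(event, context):
    # Convert every page of the PDF shipped with the deployment package.
    f = "967.pdf"
    images = convert_from_path(f, dpi=150)

    # PIL images are not JSON-serializable, so return the page count instead.
    return {
        'statusCode': 200,
        'body': len(images)
    }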

I get this error:

   {
     "errorMessage": "Unable to get page count. Is poppler installed and in PATH?",
     "errorType": "PDFInfoNotInstalledError",
     "stackTrace": [
       "  File \"/var/task/lambda_function.py\", line 15, in lambda_handler\n    images = convert_from_path(f,dpi=150,poppler_path=poppler_path)\n",
       "  File \"/opt/python/pdf2image/pdf2image.py\", line 80, in convert_from_path\n    page_count = _page_count(pdf_path, userpw, poppler_path=poppler_path)\n",
       "  File \"/opt/python/pdf2image/pdf2image.py\", line 355, in _page_count\n    \"Unable to get page count. Is poppler installed and in PATH?\"\n"
     ]
   }

1 answer
User
Answer #1 · posted 2024-04-26 19:11:14

Poppler is not installed on Lambda, so you have to package it with your deployment. Since this question comes up often, I set up a repository for the process:

https://github.com/Belval/pdf2image-as-a-service

If for some reason you don't want to use the above, these are the general steps to build poppler and include it in your package:

  1. Build poppler
  2. Move the bin/ directory and libpoppler into a specific directory in your package
  3. Edit your code to use poppler_path
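
As a rough illustration of step 3, the handler could point pdf2image at the bundled binaries. This is only a sketch under assumptions: the `poppler/bin` and `poppler/lib` layout and the `poppler_dirs` helper are hypothetical names chosen here, not something the repository above prescribes. `LAMBDA_TASK_ROOT` is the environment variable Lambda sets to the directory holding your code.

```python
import os

# Hypothetical layout (an assumption): the deployment package ships poppler
# under ./poppler/, with bin/ holding pdfinfo and pdftoppm and lib/ holding
# libpoppler and the shared libraries it depends on.
PACKAGE_ROOT = os.environ.get("LAMBDA_TASK_ROOT", os.getcwd())


def poppler_dirs(root=PACKAGE_ROOT):
    """Return the (bin, lib) directories where the bundled poppler is assumed to live."""
    base = os.path.join(root, "poppler")
    return os.path.join(base, "bin"), os.path.join(base, "lib")


def lambda_handler(event, context):
    # Imported lazily so the path helper can be used without pdf2image installed.
    from pdf2image import convert_from_path

    bin_dir, lib_dir = poppler_dirs()
    # Let the poppler binaries find the bundled shared libraries.
    os.environ["LD_LIBRARY_PATH"] = (
        lib_dir + os.pathsep + os.environ.get("LD_LIBRARY_PATH", "")
    )

    # poppler_path tells pdf2image where pdfinfo and pdftoppm are located.
    images = convert_from_path("967.pdf", dpi=150, poppler_path=bin_dir)

    # PIL images are not JSON-serializable, so report the page count.
    return {"statusCode": 200, "body": str(len(images))}
```

If your package places the binaries somewhere else, only `poppler_dirs` needs to change; the rest of the handler stays the same.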

Again, you can also just read the script in as-a-function/amazon/lambda.sh
