Python PyPDF2: counting pages in scanned PDFs produces "Xref table not zero-indexed"

from PyPDF2 import PdfFileReader
from pathlib import Path
import os
import math
import logging

numPages = 0
workPath = input('Please introduce your working directory: ')
print('Your selected path is ' + workPath)
os.chdir(workPath.encode())
logging.basicConfig(filename='errrors.log', level=logging.DEBUG,
                    format='%(asctime)s - %(levelname)s - %(message)s')
fout = open('PagesCount.txt', 'w', encoding="utf-8")
path_files = Path(workPath)
for file in path_files.glob('**/*.pdf'):
    page_Count = 0
    try:
        with open(str(file), "br") as PDF:
            try:
                page_Count = PdfFileReader(PDF).getNumPages()
                numPages = numPages + page_Count
                print('Pages in ' + str(file) + ': ' + str(page_Count) + ' pages')
                fout.write('Pages in ' + str(file) + ':\t' + str(page_Count) + ' pages\n')
            except:
                print('File {} cannot be read'.format(str(file)))
                logging.error('File cannot be read:\t {}'.format(str(file)))
    except:
        logging.error('File is not processed: {}'.format(str(file)))
print('Total number of pages:\t' + str(numPages) + ' pages')
fout.write('Total number of pages:\t' + str(numPages) + ' pages\n')

1 Answer

User

#1 · Posted on 2024-04-26 22:14:07

I have solved part of the problem. Setting the strict parameter to False lets PyPDF2 open more files than before.

Change this line from:

    page_Count = PdfFileReader(PDF).getNumPages()

to:

    page_Count = PdfFileReader(PDF, strict=False).getNumPages()
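The change above can be folded into a small helper, sketched here as one possible arrangement rather than a definitive fix. The names count_pages, count_tree, and the reader_cls parameter are hypothetical and not part of the original script; reader_cls exists only so the helper can be tried with a stand-in class. The sketch assumes an older PyPDF2 release that still provides PdfFileReader and getNumPages(); newer releases rename these, so adjust accordingly.

```python
import logging
from pathlib import Path


def count_pages(pdf_path, reader_cls=None):
    """Return the page count of one PDF, or None if it cannot be read.

    reader_cls is a hypothetical hook for substituting the reader class;
    by default the PyPDF2 PdfFileReader from the question is used.
    """
    if reader_cls is None:
        # Imported lazily so the helper can be exercised without PyPDF2.
        from PyPDF2 import PdfFileReader
        reader_cls = PdfFileReader
    try:
        with open(pdf_path, "rb") as pdf:
            # strict=False asks PyPDF2 to tolerate recoverable problems
            # instead of raising on them, as described in the answer.
            return reader_cls(pdf, strict=False).getNumPages()
    except Exception as exc:
        # Catch Exception rather than using a bare except, so that
        # KeyboardInterrupt and SystemExit still propagate.
        logging.error("File cannot be read:\t%s (%s)", pdf_path, exc)
        return None


def count_tree(root, reader_cls=None):
    """Sum page counts over every *.pdf under root, skipping unreadable files."""
    total = 0
    for pdf_path in sorted(Path(root).glob("**/*.pdf")):
        pages = count_pages(pdf_path, reader_cls)
        if pages is not None:
            total += pages
    return total
```

Calling count_tree(workPath) would then replace the loop in the question. Logging the exception text alongside the file name also makes it easier to see which files still fail with strict=False, since the answer notes that only part of the problem is solved.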
