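Problem merging PDF files with the PyPDF2 module
I built a dictionary from a table; for each key it holds a list of PDF file paths. When a key has more than one value, I want to merge those PDFs into a single file and use the key as the output file name. However, when I try to write out the merged file, I get an attribute error: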
'unicode' object has no attribute 'write'.
I based my code on this post. Can anyone help me see what might be going wrong?
import arcpy, os, PyPDF2, shutil

arcpy.env.overwriteOutput = True

gb_xls = r'P:\Records\GIS\__Databases__\MapIndex\_MapSets_Grantors_Verification_Q.xlsx'
gb_gdb_tbl = r'C:\temp\temp.gdb\_MapSets_Grantors_Verification_Q'
gb_tbl_sort = r'C:\temp\temp.gdb\_MapSets_Grantors_Verification_Q'
gb_fields = ['Actual_SheetLabel','GBSheetLabel','Image_Path_Filename']
gb_dict = {}
v_list = []
lastkey = -1
lastvalue = ""

rows = sorted(arcpy.da.SearchCursor(gb_gdb_tbl,gb_fields))
for row in rows:
    k = row[0]
    v = row[2]
    if k not in gb_dict:
        gb_dict[k] = v
    if k == lastkey:
        v = str(lastvalue) + ', ' + str(v)
        gb_dict[k] = v
    lastkey = k
    lastvalue = v

merged_file = PyPDF2.PdfFileMerger()
for k,v in gb_dict.items():
    new_file = os.path.join(r'D:\GrantorBoxes_Merged_Pdfs',k+'.pdf')
    if len(str(v).split(',')) > 1:
        for i in [v]:
            val = i.split(',')[0]
            merged_file.append(PyPDF2.PdfFileReader(val, 'rb'))
        merged_file.write(new_file)
    else:
        shutil.copyfile(v,new_file)
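The error message suggests that `write()` received the output path (a `unicode` string) and tried to call `.write` on it as though it were an open file. The sketch below reproduces that pattern without PyPDF2: `FakeMerger` is a hypothetical stand-in written for illustration, not the library's class, and it assumes only that `write()` expects a file-like object.

```python
import io


class FakeMerger(object):
    """Hypothetical stand-in for a merger whose write() expects a file object."""

    def __init__(self):
        self.chunks = []

    def append(self, data):
        self.chunks.append(data)

    def write(self, fileobj):
        # Calls .write on whatever it is given, so a plain path string fails here.
        fileobj.write(b"".join(self.chunks))


merger = FakeMerger()
merger.append(b"page-1;")
merger.append(b"page-2;")

# Passing a path string raises the same kind of AttributeError as in the question.
try:
    merger.write(u"out.pdf")
except AttributeError as exc:
    error_text = str(exc)

# Passing an object opened for binary writing works.
buffer = io.BytesIO()
merger.write(buffer)
merged_bytes = buffer.getvalue()
```

If the same applies to the installed PyPDF2 version, the corresponding change would be to open the output with `open(new_file, 'wb')` and pass that file object to `merged_file.write`. Two further details in the code above are worth checking: the second positional parameter of `PdfFileReader` is `strict`, so `'rb'` is not read as a file mode, and `i.split(',')[0]` appends only the first path of each group.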
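Update:
I have some different code that uses PyPDF2 to merge the PDF files, and it merges them without errors. My problem now is that it merges more files than I expect. I want to loop through my dictionary, find the entries where a key has more than one value (PDF file), and merge those values into a single file named after the key. Something is probably wrong with my loop or indentation, but I can't see it. Here is the updated code: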
import arcpy, os, PyPDF2, shutil

arcpy.env.overwriteOutput = True

gb_xls = r'P:\Records\GIS\__Databases__\MapIndex\_MapSets_Grantors_Verification_Q.xlsx'
gb_gdb_tbl = r'C:\temp\temp.gdb\_MapSets_Grantors_Verification_Q'
gb_tbl_sort = r'C:\temp\temp.gdb\_MapSets_Grantors_Verification_Q'
gb_fields = ['Actual_SheetLabel','GBSheetLabel','Image_Path_Filename']
gb_dict = {}
lastkey = -1
lastvalue = ""

rows = sorted(arcpy.da.SearchCursor(gb_gdb_tbl,gb_fields))
for row in rows:
    k = row[0]
    v = row[2]
    if k not in gb_dict:
        gb_dict[k] = v
    if k == lastkey:
        v = str(lastvalue) + ',' + str(v)
        gb_dict[k] = v
    lastkey = k
    lastvalue = v

merger = PyPDF2.PdfFileMerger()
for k,v in gb_dict.items():
    v_list = v.split(',')
    if len(v_list) > 1:
        for i in v_list:
            print k,',',i
            input = open(i,'rb')
            merger.append(input)
        output = open(os.path.join(r'D:\GrantorBoxes_Merged_Pdfs',k+'.pdf'), "wb")
        merger.write(output)
        print output
    else:
        new_file = os.path.join(r'D:\GrantorBoxes_Merged_Pdfs',k+'.pdf')
        shutil.copyfile(str(v),new_file)
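One likely cause, judging only from the code above, is that `merger` is created once before the loop, so every key's output also contains the pages appended for earlier keys. The sketch below illustrates the difference with a hypothetical `FakeMerger` that merely records what was appended; the sample dictionary and file names are made up for illustration.

```python
class FakeMerger(object):
    """Hypothetical stand-in that only records the paths appended to it."""

    def __init__(self):
        self.pages = []

    def append(self, path):
        self.pages.append(path)


# Hypothetical sample data: key -> comma-separated PDF paths.
gb_dict = {"A1": "a1_p1.pdf,a1_p2.pdf", "B7": "b7_p1.pdf,b7_p2.pdf"}


def merge_shared(groups):
    """One merger for all keys: pages pile up from key to key."""
    merger = FakeMerger()
    result = {}
    for key in sorted(groups):
        for path in groups[key].split(","):
            merger.append(path)
        result[key] = list(merger.pages)
    return result


def merge_fresh(groups):
    """A new merger per key: each output holds only that key's pages."""
    result = {}
    for key in sorted(groups):
        merger = FakeMerger()
        for path in groups[key].split(","):
            merger.append(path)
        result[key] = list(merger.pages)
    return result


shared = merge_shared(gb_dict)
fresh = merge_fresh(gb_dict)
```

Here `shared["B7"]` holds four paths while `fresh["B7"]` holds two. In the real script this would correspond to moving `PyPDF2.PdfFileMerger()` inside the `for k,v` loop, and closing `output` after writing would also be sensible.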
1 Answer
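I had been using PyPDF2 to work with the PDF files, but then found that I could use the arcpy mapping module directly, which has some functions for handling PDF documents. Here is the working code I ended up with: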
import arcpy, os, shutil

arcpy.env.overwriteOutput = True

gb_xls = r'P:\Records\GIS\__Databases__\MapIndex\_MapSets_Grantors_Verification_Q.xlsx'
gb_gdb_tbl = r'C:\temp\temp.gdb\_MapSets_Grantors_Verification_Q'
gb_tbl_sort = r'C:\temp\temp.gdb\_MapSets_Grantors_Verification_Q'
gb_fields = ['Actual_SheetLabel','GBSheetLabel','Image_Path_Filename']
gb_dict = {}
lastkey = -1
lastvalue = ""

# Build {sheet label: comma-separated PDF paths}; sorting keeps equal labels adjacent.
rows = sorted(arcpy.da.SearchCursor(gb_gdb_tbl,gb_fields))
for row in rows:
    k = row[0]
    v = row[2]
    if k not in gb_dict:
        gb_dict[k] = v
    if k == lastkey:
        v = str(lastvalue) + ',' + str(v)
        gb_dict[k] = v
    lastkey = k
    lastvalue = v

for k, val in gb_dict.items():
    val_list = val.split(',')
    pdf_path = os.path.join(r'D:\GrantorBoxes_Merged_Pdfs',k + '.pdf')
    if len(val_list) > 1:
        # Several PDFs for this label: start a new document for each key and append every file.
        out_pdf_file = arcpy.mapping.PDFDocumentCreate(pdf_path)
        for v in val_list:
            print k,v
            out_pdf_file.appendPages(v)
        print out_pdf_file
        out_pdf_file.saveAndClose()
    else:
        # Only one PDF for this label: copy it under the new name.
        shutil.copyfile(str(val),pdf_path)
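The dictionary-building loop shared by the scripts above joins paths into a comma-separated string and splits it again later. As a minimal sketch of an alternative, the grouping could keep a list per key using `collections.defaultdict`. The sample rows and file names below are hypothetical and stand in for the `SearchCursor` results, using the same field order as `gb_fields`.

```python
from collections import defaultdict

# Hypothetical rows: (Actual_SheetLabel, GBSheetLabel, Image_Path_Filename)
rows = [
    ("B7", "GB-2", "b7_p1.pdf"),
    ("A1", "GB-1", "a1_p1.pdf"),
    ("B7", "GB-2", "b7_p2.pdf"),
]

# Group the PDF paths by sheet label; sorting gives a stable page order per label.
paths_by_label = defaultdict(list)
for label, _gb_label, pdf_path in sorted(rows):
    paths_by_label[label].append(pdf_path)

# Labels with several PDFs need merging; labels with one PDF only need copying.
to_merge = {k: v for k, v in paths_by_label.items() if len(v) > 1}
to_copy = {k: v[0] for k, v in paths_by_label.items() if len(v) == 1}
```

Keeping lists avoids the `split(',')` step, which would break on any path that itself contains a comma.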