Python: search and replace in a binary file

29 votes

2 answers

79400 views

Asked 2025-04-16 00:42

I am trying to search for and replace some text (e.g. 'Smith, John') in this PDF form file, header.fdf, which I presume is handled as binary:

'%FDF-1.2\n%\xe2\xe3\xcf\xd3\n1 0 obj\n<</FDF<</Fields[<</V(M)/T(PatientSexLabel)>><</V(24-09-1956  53)/T(PatientDateOfBirth)>><</V(Fisher)/T(PatientLastNameLabel)>><</V(CNSL)/T(PatientConsultant)>><</V(28-01-2010 18:13)/T(PatientAdmission)>><</V(134 Field Street\\rBlackburn BB1 1BB)/T(PatientAddressLabel)>><</V(Smith, John)/T(PatientName)>><</V(24-09-1956)/T(PatientDobLabel)>><</V(0123456)/T(PatientRxr)>><</V(01234567891011)/T(PatientNhsLabel)>><</V(John)/T(PatientFirstNameLabel)>><</V(0123456)/T(PatientRxrLabel)>>]>>>>\nendobj\ntrailer\n<</Root 1 0 R>>\n%%EOF\n'

After running:

f=open("header.fdf","rb")
s=f.read()
f.close()
s=s.replace(b'PatientName',name)

I get the following error:

Traceback (most recent call last):
  File "/home/aj/Inkscape/Med/GAD/gad.py", line 56, in <module>
    s=s.replace(b'PatientName',name)
TypeError: expected an object with the buffer interface

Is there a good way to do this?

Tags: text replacement, file format, binary file, PDF processing

2 Answers

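You appear to be using Python 3.x. Your example does not show how name is defined, but that is where the problem lies. Most likely you defined it as a Unicode string (str):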

name = 'blah'

It needs to be a bytes object as well:

name = b'blah'

This works:

Python 3.1.2 (r312:79149, Mar 21 2010, 00:41:52) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> f = open('file.txt','rb')
>>> s = f.read()
>>> f.close()
>>> s
b'Test File\r\n'
>>> name = b'Replacement'
>>> s=s.replace(b'File',name)
>>> s
b'Test Replacement\r\n'

When calling replace on a bytes object, both arguments must be bytes objects too.
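To tie this together, here is a minimal sketch of the full read, replace, write cycle. The helper name replace_in_binary_file and the latin-1 default are my own assumptions, not something from the question; pick whatever encoding your FDF values actually use.

```python
from pathlib import Path


def replace_in_binary_file(path, old, new, encoding="latin-1"):
    """Replace every occurrence of `old` with `new` in the file at `path`.

    `old` and `new` may be str or bytes. str values are encoded first,
    because bytes.replace() only accepts bytes-like arguments.
    Returns the number of occurrences that were replaced.
    """
    def to_bytes(value):
        # Assumption: text values are representable in `encoding`.
        return value.encode(encoding) if isinstance(value, str) else bytes(value)

    old_bytes, new_bytes = to_bytes(old), to_bytes(new)
    file_path = Path(path)
    data = file_path.read_bytes()          # same as open(path, "rb").read()
    count = data.count(old_bytes)
    file_path.write_bytes(data.replace(old_bytes, new_bytes))
    return count
```

Something like replace_in_binary_file("header.fdf", "Smith, John", "Doe, Jane") would then rewrite the value in place. Keep in mind that inside a PDF/FDF literal string, backslashes and unbalanced parentheses in the new value have to be escaped with a backslash; this sketch does not do that.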

Answered 2025-04-16 by Python大师

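f = open("header.fdf", "rb")
s = f.read().decode("latin-1")  # decode to text; latin-1 maps every byte to one character
f.close()
s = s.replace('PatientName', name)  # name is a str here; encode s with latin-1 again before writing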

Or

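f = open("header.fdf", "rb")
s = f.read()
f.close()
# bytes(name) without an encoding raises TypeError for a str in Python 3;
# latin-1 is an assumption here, use the encoding your form values need
s = s.replace(b'PatientName', name.encode('latin-1'))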

Probably the latter, since I don't think you can use a Unicode name with this kind of replacement anyway.
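For reference, a small sketch of how the str/bytes conversions behave in Python 3, since that is the crux of both options. The sample bytes are cut from the question's file; the replacement name "Doe, Jane" and the choice of latin-1 are made up for illustration.

```python
raw = b'<</V(Smith, John)/T(PatientName)>>'

# str() on a bytes object returns its repr, b'' wrapper included,
# so it is not a way to get at the text.
as_repr = str(raw)

# decode() gives real text; latin-1 round-trips every byte value unchanged.
as_text = raw.decode("latin-1")

name = "Doe, Jane"  # hypothetical replacement value

# bytes() on a str needs an explicit encoding, otherwise it raises TypeError.
try:
    bytes(name)
    needs_encoding = False
except TypeError:
    needs_encoding = True

# Encoding the name first keeps everything in bytes for replace().
patched = raw.replace(b"Smith, John", name.encode("latin-1"))
```

So if the file is kept as bytes, name.encode(...) is the conversion to reach for; if it is decoded to text, remember to encode it with the same codec before writing it back.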

Answered 2025-04-16 by Python大师
