Reading and parsing binary file data in Python

27 votes

3 answers

106304 views

Asked 2025-04-16 05:33

I want to read a file byte by byte and check whether the last (least significant) bit of each byte is set to 1:

#!/usr/bin/python

def main():
    fh = open('/tmp/test.txt', 'rb')
    try:
        byte = fh.read(1)
        while byte != "":
            if (int(byte,16) & 0x01) is 0x01:
                print 1
            else:
                print 0
            byte = fh.read(1)
    finally:
        fh.close

    fh.close()

if __name__ == "__main__":
        main()

The error I get is:

Traceback (most recent call last):
  File "./mini_01.py", line 21, in <module>
    main()
  File "./mini_01.py", line 10, in main
    if (int(byte,16) & 0x01) is 0x01:
ValueError: invalid literal for int() with base 16: '\xaf'

Does anyone know how to fix this? I have tried the struct and binascii modules, without success.

Tags: data parsing, binary files, byte reading, struct module

3 Answers

One approach:

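import array
import os

path = "/tmp/test.txt"
filebytes = array.array('B')
with open(path, "rb") as f:
    # fromfile() needs an explicit item count; for type 'B' that is the file size in bytes
    filebytes.fromfile(f, os.path.getsize(path))
if all(i & 1 for i in filebytes):
    print("all file bytes are odd")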

Another approach:

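fobj = open("/tmp/test.txt", "rb")

try:
    import functools
except ImportError:
    # fallback for very old Pythons that lack functools
    bytereader = lambda: fobj.read(1)
else:
    bytereader = functools.partial(fobj.read, 1)

# iter(callable, sentinel) calls bytereader() until it returns '' at end of file
# (Python 2 str; under Python 3 the sentinel would have to be b'')
if all(ord(byte) & 1 for byte in iter(bytereader, '')):
    print("all bytes are odd")
fobj.close()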

Answered 2025-04-16 by Python大师


Try the bytearray type (Python 2.6 and later); it is much better suited to handling byte data. Your try block could then be just:

ba = bytearray(fh.read())
for byte in ba:
    print byte & 1

Or you can build a list of results:

low_bit_list = [byte & 1 for byte in bytearray(fh.read())]

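This works because indexing (or iterating over) a bytearray gives you an integer in the range 0 to 255, whereas reading a single byte from the file gives you a one-character string, which you then have to convert to an integer with ord.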
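To make that difference concrete, here is a small sketch. It assumes Python 3 (the thread itself is Python 2), and `data` is made-up sample data standing in for `fh.read()`:

```python
# Assumption: Python 3; `data` is hypothetical sample data, not the asker's file.
data = b"\xaf\x10\x01"

ba = bytearray(data)
first = ba[0]          # indexing a bytearray yields an int
one_char = data[0:1]   # a length-1 slice is still a bytes object, so it needs ord()

# Same list comprehension as above, applied to the sample data
low_bit_list = [byte & 1 for byte in ba]
print(first, ord(one_char), low_bit_list)
```

In Python 3, indexing a plain bytes object also yields an int, so the ord step is only needed for length-1 slices or for the result of `fh.read(1)`.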

If your file is too big to hold comfortably in memory (though I'm guessing it isn't), then mmap can be used to create the bytearray from a buffer:

import mmap
m = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)
ba = bytearray(m)
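One caveat is that `bytearray(m)` copies the mapped bytes into memory, so for a really large file it may be preferable to index the map directly. A minimal sketch, assuming Python 3; the helper name `low_bits_mmap` and the temporary sample file are illustrative and not part of the original answer:

```python
import mmap
import os
import tempfile

def low_bits_mmap(path):
    """Yield the lowest bit of each byte, reading through a memory map.

    Hypothetical helper for illustration; mmap rejects empty files.
    """
    with open(path, "rb") as fh:
        # Length 0 maps the whole file
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for i in range(len(m)):
                yield m[i] & 1  # indexing an mmap gives an int in Python 3

# Made-up sample data written to a temporary file
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\xaf\x10\x01")
    sample_path = tmp.name

mmap_bits = list(low_bits_mmap(sample_path))
os.remove(sample_path)
print(mmap_bits)
```

Because the generator only touches one byte at a time, the operating system can page the file in and out as needed instead of holding a full copy.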

Answered 2025-04-16 by Python大师


You want ord rather than int:

if (ord(byte) & 0x01) == 0x01:
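The reason is that int(byte, 16) tries to parse the byte as hexadecimal text (for example "af"), while ord returns the numeric value of the raw byte itself. As a rough sketch, the asker's loop with that fix might look like the following in Python 3; the function name `low_bits` and the temporary sample file are assumptions made for illustration:

```python
import os
import tempfile

def low_bits(path):
    """Return the lowest bit of every byte in the file, one read at a time."""
    bits = []
    with open(path, "rb") as fh:      # the with block closes the file, even on error
        byte = fh.read(1)
        while byte != b"":            # read() returns b"" at end of file
            bits.append(ord(byte) & 0x01)
            byte = fh.read(1)
    return bits

# Made-up sample data written to a temporary file
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\xaf\x10\x01")
    sample_path = tmp.name

sample_bits = low_bits(sample_path)
os.remove(sample_path)
print(sample_bits)
```

Comparing with == rather than is also matters here: is tests object identity, which only happens to work for small integers in CPython.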

Answered 2025-04-16 by Python大师

