Take a hex file and extract d

2 answers

User

#1 · edited 2024-06-06 20:04:37

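Given the file size, you may want to load everything into memory (keeping the data as bytes) and then use a regular expression to extract the part between the header and the footer, for example: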

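import binascii
import re

header = binascii.unhexlify('000100a0')
# unhexlify() rejects an odd number of hex digits, so the footer is written
# here as 12 digits (six zero bytes); adjust it to the real footer length.
footer = binascii.unhexlify('000000000000')

with open('hexfile', 'rb') as fin:
    raw_data = fin.read()

# The data is bytes, so the pattern must be bytes as well.
# re.DOTALL lets '.' also match the newline byte (0x0a) inside the body.
pattern = re.escape(header) + b'(.*?)' + re.escape(footer)
match = re.search(pattern, raw_data, re.DOTALL)
data = match.group(1) if match else None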
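When the file holds several header/body/footer blocks, the same regular-expression idea extends to `re.finditer`. The following is only a sketch in Python 3: the six-zero-byte footer and the in-memory sample bytes are assumptions made for illustration, and `extract_bodies` is a hypothetical helper name.

```python
import binascii
import re

HEADER = binascii.unhexlify('000100a0')
FOOTER = binascii.unhexlify('000000000000')  # assumption: six zero bytes

# Bytes pattern; the lazy group stops at the first footer after each header.
BLOCK_RE = re.compile(re.escape(HEADER) + b'(.*?)' + re.escape(FOOTER), re.DOTALL)


def extract_bodies(raw_data):
    """Return every body found between HEADER and FOOTER, in file order."""
    return [m.group(1) for m in BLOCK_RE.finditer(raw_data)]


# Illustrative data: junk, block, junk, block, junk.
sample = binascii.unhexlify(
    'abcd'
    '000100a0aaaaaa000000000000'
    'abcdabcd'
    '000100a0bbbbbb000000000000'
    'abcd'
)

bodies = extract_bodies(sample)
print([binascii.hexlify(b).decode() for b in bodies])  # prints ['aaaaaa', 'bbbbbb']
```

Because the group is lazy, a body that itself contains a run of zero bytes as long as the footer would be cut short; that ambiguity comes from the file format, not from the regex.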

User

#2 · edited 2024-06-06 20:04:37

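If the file is small enough to load into memory, you can treat it as a regular string and use the str.find method to navigate it.

Let's look at a worse case: you have no guarantee that the header is the first thing in the file, and there may be several bodies (several <header><body><footer> blocks). I created a file named bindata.txt with the following content: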

ABCD000100a0AAAAAA000000000000ABCDABCD000100a0BBBBBB000000000000ABCD

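OK, there are two bodies, the first being AAAAAA and the second BBBBBB, plus some junk at the beginning, in the middle and at the end (ABCD before the first header, ABCDABCD before the second header, and ABCD after the second footer).

Playing with the find method of str objects and with indexes, here is what I came up with: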

header = "000100a0"
footer = "00000000000"

with open('bindata.txt', 'r') as f:
    data = f.read()

print("Data: %s" % data)
pos = 0
while True:
    # Look for the next header at or after the current position.
    header_index = data.find(header, pos)
    if header_index < 0:
        break
    # The footer search starts right after the header that was just found.
    body_start = header_index + len(header)
    footer_index = data.find(footer, body_start)
    if footer_index < 0:
        break
    print("Found header at %s and footer at %s" % (header_index, footer_index))
    print("body: %s" % data[body_start:footer_index])
    # Continue after this footer to pick up any further blocks.
    pos = footer_index + len(footer)

The result is:

Data: ABCD000100a0AAAAAA000000000000ABCDABCD000100a0BBBBBB000000000000ABCD
Found header at 4 and footer at 18
body: AAAAAA
Found header at 38 and footer at 52
body: BBBBBB

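If your file is too large to hold in memory, I think the best approach is to read the file byte by byte and write a couple of functions that find where the header ends and where the footer starts, using the file's seek and tell methods.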
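One way to sketch that idea without reading a single byte at a time is to scan the file in fixed-size chunks and carry over the last few bytes of each chunk, so that a marker split across two chunks is still found. The helper names `find_in_file` and `iter_bodies`, the chunk size, and the in-memory sample are all assumptions for illustration; this is not the only way to do it.

```python
import io


def find_in_file(f, needle, start=0, chunk_size=64 * 1024):
    """Return the absolute offset of the first non-empty `needle` at or after `start`, or -1."""
    f.seek(start)
    overlap = len(needle) - 1
    tail = b''
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            return -1
        window = tail + chunk
        hit = window.find(needle)
        if hit >= 0:
            # f.tell() points just past `window`, so subtract its length.
            return f.tell() - len(window) + hit
        # Keep the last len(needle) - 1 bytes in case the needle straddles chunks.
        tail = window[-overlap:] if overlap else b''


def iter_bodies(f, header, footer, chunk_size=64 * 1024):
    """Yield each body found between `header` and `footer` in the open binary file `f`."""
    pos = 0
    while True:
        header_at = find_in_file(f, header, pos, chunk_size)
        if header_at < 0:
            return
        body_start = header_at + len(header)
        footer_at = find_in_file(f, footer, body_start, chunk_size)
        if footer_at < 0:
            return
        f.seek(body_start)
        yield f.read(footer_at - body_start)
        pos = footer_at + len(footer)


# Illustrative data; io.BytesIO stands in for a file opened with open(path, 'rb').
sample = bytes.fromhex('abcd' '000100a0aaaaaa000000000000' 'abcdabcd'
                       '000100a0bbbbbb000000000000' 'abcd')
header = bytes.fromhex('000100a0')
footer = bytes.fromhex('0000000000')

# A tiny chunk size forces the markers to straddle chunk boundaries.
bodies = list(iter_bodies(io.BytesIO(sample), header, footer, chunk_size=3))
print([b.hex() for b in bodies])  # prints ['aaaaaa', 'bbbbbb']
```

Only one chunk plus a short tail is held in memory during the search; each body is still read in one piece, which is fine as long as individual bodies are small.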

Edit:

As the OP requested, here is a way to do it without hexlify (working on the raw binary) and with seek and tell:

import binascii
import mmap

header = binascii.unhexlify("000100a0")
footer = binascii.unhexlify("0000000000")
sample = binascii.unhexlify("ABCD"
                "000100a0AAAAAA000000000000"
                "ABCDABCD"
                "000100a0BBBBBB000000000000"
                "ABCD")

# Create the sample file:
with open("sample.data", "wb") as f:
    f.write(sample)

# Sample done. sample.data now holds real binary data.

with open('sample.data', 'rb') as f:
    # ACCESS_READ maps the file read-only and works on both Unix and Windows.
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    print("Data: %s" % binascii.hexlify(mm[:]).decode())
    pos = 0
    while True:
        # Look for the next header at or after the current position.
        header_index = mm.find(header, pos)
        if header_index < 0:
            break
        # The footer search starts right after the header that was just found.
        footer_index = mm.find(footer, header_index + len(header))
        if footer_index < 0:
            break
        print("Found header at %s and footer at %s"
              % (header_index, footer_index))
        # Move to the end of the header and read up to the footer.
        mm.seek(header_index + len(header))
        body = mm.read(footer_index - mm.tell())
        print("body: %s" % binascii.hexlify(body).decode())
        pos = footer_index + len(footer)
    mm.close()

This approach produces the following output:

Data: abcd000100a0aaaaaa000000000000abcdabcd000100a0bbbbbb000000000000abcd
Found header at 2 and footer at 9
body: aaaaaa
Found header at 19 and footer at 26
body: bbbbbb
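The offsets differ from the text-based run (4 and 18 there, 2 and 9 here) because each byte is written as two hex digits, so a position in the hex string is twice the position in the raw file. The short check below reuses the same sample data; the second half illustrates, as a side note, that searching the hex text can also match across byte boundaries, which a raw byte search cannot do.

```python
import binascii

hex_text = "ABCD000100a0AAAAAA000000000000ABCDABCD000100a0BBBBBB000000000000ABCD"
raw = binascii.unhexlify(hex_text)

# Header: offset 4 in the hex text corresponds to byte offset 2.
hex_header_at = hex_text.find("000100a0")
raw_header_at = raw.find(binascii.unhexlify("000100a0"))
print(hex_header_at, raw_header_at)  # prints 4 2

# Footer (five zero bytes): offset 18 in the hex text, byte offset 9.
hex_footer_at = hex_text.find("0000000000")
raw_footer_at = raw.find(binascii.unhexlify("0000000000"))
print(hex_footer_at, raw_footer_at)  # prints 18 9

# Misaligned match: "1001" contains the text "00", but the bytes 0x10 0x01
# contain no zero byte at all.
print("1001".find("00"), bytes.fromhex("1001").find(b"\x00"))  # prints 1 -1
```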

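Note that I used Python's mmap module to help move around the file. Take a look at its documentation. Also, the first part of this example contains some data that is used to create an actual binary file in sample.data. Executing this block: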

# Create the sample file:
with open("sample.data", "wb") as f:
    f.write(sample)

produces the following (far from human-readable) file:

borrajax@borrajax:~/Documents/Tests$ cat ./sample.data 
�������ͫ�������
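Since cat prints the raw bytes, a hex view is easier to inspect. A minimal sketch of a hex dump helper follows; the name `hexdump` and the 16-bytes-per-row layout are illustrative choices, and the bytes are built in memory instead of being read back from sample.data.

```python
def hexdump(data, width=16):
    """Format bytes as rows of 'offset  hex hex hex ...'."""
    lines = []
    for offset in range(0, len(data), width):
        row = data[offset:offset + width]
        hex_part = ' '.join('%02x' % b for b in row)
        lines.append('%08x  %s' % (offset, hex_part))
    return '\n'.join(lines)


# Same bytes as the sample file written above.
sample = bytes.fromhex('abcd' '000100a0aaaaaa000000000000' 'abcdabcd'
                       '000100a0bbbbbb000000000000' 'abcd')
print(hexdump(sample))
```

To dump the file itself, the same helper can be fed the result of `open('sample.data', 'rb').read()`.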
