How can I access Firefox's internal IndexedDB files using Python?

Question

I need to read Firefox's IndexedDB data from Python.

I use the sqlite3 package to retrieve the IndexedDB contents:

import sqlite3

# indexeddb_file is the path to the profile's IndexedDB .sqlite file
with sqlite3.connect(indexeddb_file) as conn:
    c = conn.cursor()
    c.execute('select * from object_data;')
    rows = c.fetchall()
    for row in rows:
        print(row[2])

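However, although I know the database contents are strings, they are stored as SQLite BLOBs (binary large objects). Is there a way to read these BLOB-stored strings from Python?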

I have tried:

The SQL functions hex() and quote(), which only encode the BLOB as hexadecimal
Writing the BLOB to a file, which gives the same problem
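For reference, here is a minimal sketch of how BLOB columns come back from the sqlite3 module in Python 3: as bytes objects, which can be turned into a hex string with .hex() without going through SQL hex(). The in-memory table and its column names (key_value, data) are stand-ins chosen for illustration, not the confirmed Firefox schema, and read_blobs is a hypothetical helper.

```python
import sqlite3

# Hypothetical stand-in for the real file: an in-memory table with BLOB columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE object_data (id INTEGER, key_value BLOB, data BLOB)")
conn.execute(
    "INSERT INTO object_data VALUES (?, ?, ?)",
    (1, b"\x03\x62\x63", "hi".encode("utf-16-le")),
)

def read_blobs(connection):
    """Return (key, data) for every row, each as a bytes object."""
    rows = connection.execute("SELECT key_value, data FROM object_data").fetchall()
    return [(bytes(key), bytes(data)) for key, data in rows]

blobs = read_blobs(conn)
for key, data in blobs:
    # .hex() gives the same hex digits that SQL hex() would, in lowercase
    print(key.hex(), data.hex())
conn.close()
```

The raw bytes are what any decoding step has to start from; the hex form is only a convenient way to look at them.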

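Update

Based on the IndexedDB implementation described in the Firefox source code, I have implemented part of Firefox's encoding scheme for database keys in Python. So far I have only implemented the string case, but the other types should be simpler to implement: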

import re

BYTE_LENGTH = 8

def hex_to_bin(hex_str):
    """Return binary representation of hexadecimal string."""
    return trim_bin(int(hex_str, 16)).zfill(len(hex_str) * 4)

def byte_to_unicode(bin_byte):
    """Return the character whose code point is given as a binary string."""
    return chr(int(str(bin_byte), 2))

def trim_bin(int_n):
    """Return int num converted to trimmed bin representation."""
    return bin(int_n)[2:]

def decode(key):
    """Return decoded idb key (key is given as a hex string)."""
    decoded = key
    m = re.search("[1-9]", key)  # first non-zero digit holds the type
    if not m:
        raise ValueError("no type digit found in key: %r" % key)
    i = m.start()
    typeoffset = int(key[i])
    data = key[i + 1:]
    if typeoffset == 1:
        # decode number
        pass
    elif typeoffset == 2:
        # decode date
        pass
    elif typeoffset == 3:
        # decode string
        bin_repr = hex_to_bin(data)
        decoded = ""
        i = 0
        # while loop: a character takes 1 to 3 bytes, so the step varies
        while i < len(bin_repr):
            byte = bin_repr[i:i + BYTE_LENGTH]
            if int(byte, 2) == 0:
                # a zero byte cannot be a character (values are stored +1)
                break
            if byte[0] == '0':
                # one byte, 0xxxxxxx: stored with 1 added
                decoded += byte_to_unicode(trim_bin(int(byte, 2) - 1))
                i += BYTE_LENGTH
            elif byte[1] == '0':
                # two bytes, 10xxxxxx xxxxxxxx: stored with 127 subtracted
                bits = bin_repr[i + 2:i + 2 * BYTE_LENGTH]
                decoded += byte_to_unicode(trim_bin(int(bits, 2) + 127))
                i += 2 * BYTE_LENGTH
            else:
                # three bytes, 11xxxxxx xxxxxxxx xx000000: 16 payload bits
                bits = bin_repr[i + 2:i + 2 * BYTE_LENGTH + 2]
                decoded += byte_to_unicode(bits)
                i += 3 * BYTE_LENGTH
        return decoded
    elif typeoffset == 4:
        # decode array
        pass
    else:
        # error
        pass
    return decoded
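To sanity-check a decoder like the one above, it can help to have the inverse direction and generate test input. The following is a rough sketch of an encoder for the string payload only (no type digit), assuming the three-tier scheme the decoder implies: one byte with 1 added, two bytes with 127 subtracted, three bytes otherwise. The name encode_string_payload is hypothetical, and the output has not been compared against what Firefox actually writes.

```python
import struct

def encode_string_payload(text):
    """Encode text as a hex string using the variable-length scheme above.

    Assumption: this mirrors the string branch of decode(); it is not
    taken from Firefox and has not been verified against real databases.
    """
    raw = text.encode("utf-16-be")
    # Split into 16-bit UTF-16 code units
    units = struct.unpack(">%dH" % (len(raw) // 2), raw)
    out = bytearray()
    for unit in units:
        if unit <= 0x7E:
            # 0xxxxxxx, stored with 1 added
            out.append(unit + 1)
        elif unit <= 0x3FFF + 0x7F:
            # 10xxxxxx xxxxxxxx, stored with 127 subtracted
            value = unit - 0x7F
            out.append(0x80 | (value >> 8))
            out.append(value & 0xFF)
        else:
            # 11xxxxxx xxxxxxxx xx000000
            value = unit << 6
            out.append(0xC0 | (value >> 16))
            out.append((value >> 8) & 0xFF)
            out.append(value & 0xFF)
    return out.hex()

print(encode_string_payload("ab"))      # 6263
print(encode_string_payload("\u00e9"))  # 806a
print(encode_string_payload("\u4e2d"))  # d38b40
```

Prefixing the result with the type digits (for example "03" for the string case handled above) gives a key that can be fed back into decode() as a round-trip test.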

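However, I still cannot decode the data field of the IndexedDB database. It does not seem to use an encoding scheme as elaborate as the one for keys, because when I decode it as UTF-16 I can read parts of the actual values.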
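To make that observation concrete, here is a small heuristic that pulls out the runs of a blob that look like ASCII text stored as UTF-16. It is only a sketch: it assumes little-endian UTF-16, recognises printable ASCII only, and is a way of inspecting the data rather than a decoder for whatever format Firefox uses. The function name utf16le_runs and the sample blob are made up for illustration.

```python
import re

def utf16le_runs(blob, min_chars=4):
    """Return runs of printable ASCII stored as UTF-16LE inside blob."""
    # Each character is one printable byte followed by a zero byte
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_chars)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(blob)]

# Made-up sample: two UTF-16LE strings separated by unrelated bytes
blob = (
    b"\x01\x02"
    + "hello".encode("utf-16-le")
    + b"\xff\xfe\x00"
    + "world!".encode("utf-16-le")
)
print(utf16le_runs(blob))
```

Runs shorter than min_chars are dropped to cut down on false positives from binary data that happens to match the pattern.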

database sqlite firefox blob encoding utf-16 indexeddb

