Python: Ignore "Incorrect padding" error when base64 decoding

3 Answers

User

Answer 1 · edited 2024-05-12 23:17:04

Just add padding as needed. Heed Michael's warning, though.

b64_string += "=" * ((4 - len(b64_string) % 4) % 4) #ugh
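As a quick check of the one-liner above, here is a minimal sketch that wraps it in a helper and decodes a string whose padding was stripped. The name pad_b64 is made up for illustration. Note that a length of 4n+1 can never be valid base64, so padding will not rescue that case.

```python
import base64

def pad_b64(b64_string):
    """Append '=' until the length is a multiple of 4 (hypothetical helper)."""
    return b64_string + "=" * ((4 - len(b64_string) % 4) % 4)

# "aGVsbG8" is the base64 form of b"hello" with its trailing "=" stripped.
padded = pad_b64("aGVsbG8")
decoded = base64.b64decode(padded)
print(padded, decoded)
```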

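User

Answer 2 · edited 2024-05-12 23:17:04

As noted in other answers, base64 data can be corrupted in various ways.

However, as Wikipedia says, removing the padding (the "=" characters at the end of base64-encoded data) is "lossless":

From a theoretical point of view, the padding character is not needed, since the number of missing bytes can be calculated from the number of Base64 digits.

So if this is really the only thing "wrong" with your base64 data, the padding can simply be added back. I came up with this in order to parse "data" URLs in WeasyPrint, some of which were base64 without padding: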

import base64
import re

def decode_base64(data, altchars=b'+/'):
    """Decode base64, padding being optional.

    :param data: Base64 data as an ASCII byte string
    :returns: The decoded byte string.

    """
    # Drop everything outside the alphabet, including whitespace and old padding.
    # re.escape keeps altchars such as b'-_' literal inside the character class.
    data = re.sub(rb'[^a-zA-Z0-9%s]+' % re.escape(altchars), b'', data)  # normalize
    missing_padding = len(data) % 4
    if missing_padding:
        data += b'=' * (4 - missing_padding)
    return base64.b64decode(data, altchars)

Tests for this function: weasyprint/tests/test_css.py#L68
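To make the Wikipedia remark quoted above concrete: each base64 digit carries 6 bits, so the byte count follows from the digit count alone. This is only an illustrative sketch; decoded_length is a hypothetical name and not part of WeasyPrint or the base64 module.

```python
import base64

def decoded_length(num_digits):
    """Bytes encoded by num_digits base64 digits, padding excluded."""
    if num_digits % 4 == 1:
        # One leftover digit holds only 6 bits, less than a full byte.
        raise ValueError("no byte sequence encodes to 4n+1 base64 digits")
    return num_digits * 6 // 8

# Compare the computed length with the real length for a few inputs.
for raw in (b"a", b"ab", b"abc", b"abcd"):
    digits = base64.b64encode(raw).rstrip(b"=")
    print(raw, digits, decoded_length(len(digits)))
```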

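User

Answer 3 · edited 2024-05-12 23:17:04

"Incorrect padding" can mean not only "missing padding" but also (believe it or not) "incorrect padding".

If the suggested "add padding" methods don't work, try removing some trailing bytes: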

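import base64
import binascii

lens = len(strg)
# Drop the trailing 1 to 3 bytes (or a whole group of 4) so the length is a multiple of 4.
lenx = lens - (lens % 4 if lens % 4 else 4)
try:
    result = base64.b64decode(strg[:lenx])  # decodestring was removed in Python 3.9
except binascii.Error:
    result = None  # still not decodable; the damage is elsewhere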

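Update: Any fiddling with adding padding or removing possibly bad bytes from the end should be done AFTER removing any whitespace, otherwise the length calculations will be thrown off.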
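A minimal sketch of that ordering, assuming the input is a bytes object: strip the whitespace first, then compute the padding. The helper name strip_then_pad is made up here. In the sample, the raw length is 9 (which would suggest three "=" characters), while the cleaned length is 7 and needs only one.

```python
import base64

def strip_then_pad(data):
    """Remove ASCII whitespace, then pad to a multiple of 4 (hypothetical helper)."""
    # bytes.split() with no argument splits on any run of ASCII whitespace.
    cleaned = b"".join(data.split())
    return cleaned + b"=" * (-len(cleaned) % 4)

# b"hello" encoded, wrapped across two lines, with the padding lost.
wrapped = b"aGVs\nbG8\n"
print(base64.b64decode(strip_then_pad(wrapped)))
```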

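It would be a good idea to show us a (short) sample of the data you need to recover. Edit your question and copy/paste the output of print(repr(sample)).

Update 2: The encoding may have been done in a URL-safe manner. If so, you will see minus and underscore characters in your data, and you should be able to decode it with base64.b64decode(strg, '-_')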
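For illustration, here is a small sketch of the difference between the two alphabets. The three sample bytes are chosen only because they encode entirely to the two characters that differ. The standard library also offers base64.urlsafe_b64decode, which is equivalent to passing '-_' as the second argument.

```python
import base64

raw = b"\xfb\xff\xfe"

std = base64.b64encode(raw)          # standard alphabet uses + and /
url = base64.urlsafe_b64encode(raw)  # URL-safe alphabet uses - and _
print(std, url)

# Both calls below decode the URL-safe form back to the original bytes.
via_altchars = base64.b64decode(url, b"-_")
via_helper = base64.urlsafe_b64decode(url)
print(via_altchars == raw, via_helper == raw)
```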

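If you can't see minus and underscore characters in your data but can see plus and slash characters, then you have some other problem, and may need the add-padding or remove-cruft tricks.

If you can see none of minus, underscore, plus, and slash in your data, then you need to determine the two alternate characters; they will be the ones that aren't in [A-Za-z0-9]. Then you'll need to experiment to see which order they need to be used in as the second argument of base64.b64decode()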
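One way to run that experiment is to decode with both orderings and inspect the results by eye. This is only a sketch: decode_both_orders is a hypothetical helper, and the "." and "~" characters in the sample are an arbitrary assumption, not something taken from the question.

```python
import base64

def decode_both_orders(data, char_a, char_b):
    """Decode data with the two unknown characters in each possible order.

    Returns a dict mapping the altchars used to the decoded bytes, so the
    caller can judge which result looks like the expected content.
    """
    results = {}
    for altchars in (char_a + char_b, char_b + char_a):
        results[altchars] = base64.b64decode(data, altchars=altchars)
    return results

# Sample encoded with "." in place of "+" and "~" in place of "/".
raw = b"\xfb\xff\xfe"
encoded = base64.b64encode(raw, altchars=b".~")
candidates = decode_both_orders(encoded, b".", b"~")
for altchars, decoded in candidates.items():
    print(altchars, decoded)
```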

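Update 3: If your data is "company confidential":
(a) you should say so up front
(b) we can explore other avenues for understanding the problem, which is highly likely to be related to which characters are used instead of + and / in the encoding alphabet, or to other formatting or extraneous characters.

One such avenue would be to examine which non-"standard" characters are in your data, e.g.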

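from collections import defaultdict
import string

# Count every character that is not an ASCII letter or digit.
# Assumes your_data is a str; iterating over bytes yields ints, so decode it first.
d = defaultdict(int)
s = set(string.ascii_letters + string.digits)
for c in your_data:
    if c not in s:
        d[c] += 1
print(d)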
