Python pickle seems to fail inside a class but works in a command-line session

2 votes
2 answers
630 views
Asked 2025-04-16 06:47

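I've been trying to unpickle some dictionary data stored in a database. I'm using the marshal module for now, but I'd still like to know why pickle has so much trouble deserializing certain data. Here is a command-line Python session showing what I'm trying to do: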

>>> a = {'service': 'amazon', 'protocol': 'stream', 'key': 'lajdfoau09424jojf.flv'}
>>> import pickle; import base64
>>> pickled = base64.b64encode(pickle.dumps(a))
>>> pickled
'KGRwMApTJ3Byb3RvY29sJwpwMQpTJ3N0cmVhbScKcDIKc1Mna2V5JwpwMwpTJ2xhamRmb2F1MDk0MjRqb2pmLmZsdicKcDQKc1Mnc2VydmljZScKcDUKUydhbWF6b24nCnA2CnMu'
>>> unpickled = pickle.loads(base64.b64decode(pickled))
>>> unpickled
{'protocol': 'stream', 'service': 'amazon', 'key': 'lajdfoau09424jojf.flv'}
>>> unpickled['service']
'amazon'

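This works fine, but when I try the same thing in a class's factory method, the pickle.loads step seems to fail. The string I'm trying to load was serialized the same way as above. I even tried pasting in the exact serialized string from the command-line session, and it still didn't work. Here is the code for that attempt: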

class Resource:

    _service = 'unknown'
    _protocol = 'unknown'
    _key = 'unknown'

    '''
    Factory method that creates an appropriate instance of one of Resource’s subclasses based on 
    the type of data provided (the data being a serialized dictionary with at least the keys 'service', 
    'protocol', and 'key'). 
    @param resource_data (string) -- the data used to create the new Resource instance. 
    '''
    @staticmethod
    def resource_factory(resource_data):
        # Unpack the raw resource data and then create the appropriate Resource instance and return. 
        resource_data = "KGRwMApTJ3Byb3RvY29sJwpwMQpTJ3N0cmVhbScKcDIKc1Mna2V5JwpwMwpTJ2xhamRmb2F1MDk0MjRqb2pmLmZsdicKcDQKc1Mnc2VydmljZScKcDUKUydhbWF6b24nCnA2CnMu" #hack to just see if we can unpickle this string
        logging.debug("Creating resource: " + resource_data)
        unencoded = base64.b64decode(resource_data)
        logging.debug("Unencoded is: " + unencoded)
        unpacked = pickle.loads(unencoded)
        logging.debug("Unpacked: " + unpacked)
        service = unpacked['service']
        protocol = unpacked['protocol']
        key = unpacked['key']

        if (service == 'amazon'):
            return AmazonResource(service=service, protocol=protocol, key=key)
        elif (service == 'fs'):
            return FSResource(service=service, protocol=protocol, key=key)

2 Answers

0

Your code works. How are you testing it?

import logging
import base64
import pickle
class Resource:
    @staticmethod
    def resource_factory(resource_data):
        resource_data = "KGRwMApTJ3Byb3RvY29sJwpwMQpTJ3N0cmVhbScKcDIKc1Mna2V5JwpwMwpTJ2xhamRmb2F1MDk0MjRqb2pmLmZsdicKcDQKc1Mnc2VydmljZScKcDUKUydhbWF6b24nCnA2CnMu" #hack to just see if we can unpickle this string
        # logging.debug("Creating resource: " + resource_data)
        unencoded = base64.b64decode(resource_data)
        # logging.debug("Unencoded is: " + unencoded)
        unpacked = pickle.loads(unencoded)
        logging.debug("Unpacked: " + repr(unpacked))
        service = unpacked['service']
        protocol = unpacked['protocol']
        key = unpacked['key']

logging.basicConfig(level=logging.DEBUG)
Resource.resource_factory('')

This produces:

# DEBUG:root:Unpacked: {'protocol': 'stream', 'service': 'amazon', 'key': 'lajdfoau09424jojf.flv'}
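The same check can be run without the class or the logging calls. This is only a sketch, assuming Python 3 (the session in the question looks like Python 2), and `decode_resource` is a name made up for this example. It decodes the exact string from the question to confirm that the payload itself is loadable:

```python
import base64
import pickle

# The base64 string copied from the question.
PAYLOAD = (
    "KGRwMApTJ3Byb3RvY29sJwpwMQpTJ3N0cmVhbScKcDIKc1Mna2V5JwpwMwpTJ2xhamRmb2F1"
    "MDk0MjRqb2pmLmZsdicKcDQKc1Mnc2VydmljZScKcDUKUydhbWF6b24nCnA2CnMu"
)


def decode_resource(resource_data):
    """Hypothetical helper: base64-decode, then unpickle."""
    return pickle.loads(base64.b64decode(resource_data))


decoded = decode_resource(PAYLOAD)
print(repr(decoded))
```

If this prints a dictionary, the pickle data is fine and the failure has to come from the code around the `pickle.loads` call.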
0

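I eventually solved this by simplifying things and debugging inside Django. The main problem was that the Resource class itself contained errors that kept the resource_factory method from completing. First, I was trying to concatenate a string with a dictionary, which raises an exception. Second, elsewhere in the class I was referencing the instance variables _service, _protocol, and _key without the leading '_' (plain typos).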
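Both mistakes are easy to reproduce in isolation. The following is a minimal sketch, assuming Python 3; `concat_message` and `BrokenResource` are names invented for this example and are not the original class:

```python
import logging

unpacked = {'service': 'amazon', 'protocol': 'stream',
            'key': 'lajdfoau09424jojf.flv'}


def concat_message(value):
    """Reproduce the failing pattern: '+' between a str and a dict."""
    return "Unpacked: " + value


try:
    concat_message(unpacked)
    concat_error = None
except TypeError as exc:
    concat_error = exc  # str and dict cannot be concatenated

# Passing the dict as a logging argument (or calling repr) avoids the error.
logging.basicConfig(level=logging.DEBUG)
logging.debug("Unpacked: %r", unpacked)
safe_message = "Unpacked: %r" % (unpacked,)


class BrokenResource:
    """Hypothetical stand-in showing the missing-underscore typo."""
    _service = 'unknown'

    def describe(self):
        return self.service  # typo: the attribute is named _service


try:
    BrokenResource().describe()
    attr_error = None
except AttributeError as exc:
    attr_error = exc
```

The first failure is a `TypeError` raised while the log message is being built, before `logging.debug` is even called, so it occurs regardless of the configured log level.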

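Interestingly, when I used this code inside a Django custom field, the exception was caught somewhere along the way, and I never saw an actual error message pointing to the problem. My debug output suggested that loads was failing, but the real culprits were the debug statement itself and some of the code after it. When I implemented the same thing with a model property instead of a custom model field to hold the data, the error message was printed properly, which let me find the problem quickly.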
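How the exception was hidden isn't clear from what I saw, so the following is only a sketch of one way this kind of masking can happen, assuming some calling layer wraps the factory in a broad `except`. It is not Django's actual code; `buggy_factory`, `call_swallowing`, `call_reporting`, and the logger name are all invented for illustration (Python 3):

```python
import base64
import logging
import pickle

logger = logging.getLogger("resource_demo")  # hypothetical logger name


def buggy_factory(resource_data):
    """Unpickles correctly, then fails while building the debug message."""
    unpacked = pickle.loads(base64.b64decode(resource_data))
    logger.debug("Unpacked: " + unpacked)  # TypeError: str + dict
    return unpacked


def call_swallowing(func, data):
    """Hypothetical caller that hides every error from the developer."""
    try:
        return func(data)
    except Exception:
        return None  # the real cause is lost here


def call_reporting(func, data):
    """Same call, but the traceback is logged before re-raising."""
    try:
        return func(data)
    except Exception:
        logger.exception("resource factory failed")
        raise


payload = base64.b64encode(pickle.dumps({'service': 'amazon'}))

hidden = call_swallowing(buggy_factory, payload)

try:
    call_reporting(buggy_factory, payload)
    reported = None
except TypeError as exc:
    reported = exc
```

With the swallowing caller, the only visible symptom is a missing result, which makes it tempting to blame the last step that logged successfully. With the reporting caller, the traceback points straight at the string concatenation.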
