Can Python pickle lambda functions?

83 votes
6 answers
88972 views
Asked 2025-04-18 17:33

I have read in many threads that Python's pickle/cPickle cannot pickle lambda functions. However, the following code works in Python 2.7.6:

import cPickle as pickle

if __name__ == "__main__":
    s = pickle.dumps(lambda x, y: x+y)
    f = pickle.loads(s)
    assert f(3,4) == 7

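So what is going on here? Or rather, what are the limits on pickling lambda functions?

[Edit] I think I know why this code runs. I forgot (sorry!) that I am using Stackless Python, which has a form of micro-threads called tasklets that execute functions. These tasklets can be halted, pickled, unpickled and resumed, so I guess (I asked on the Stackless mailing list) that it also provides a way to pickle function bodies.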

6 Answers

2

What worked for me (Windows 10, Python 3.7) was passing a regular function instead of a lambda:

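from PIL import Image
from torchvision import transforms

def merge(x):
    # Split the image into channels and merge them back in reverse order
    return Image.merge("RGB", x.split()[::-1])

transforms.Lambda(merge)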

instead of:

transforms.Lambda(lambda x: Image.merge("RGB", x.split()[::-1]))

No need for dill or cPickle.

2

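This may be obvious, but I would like to offer one more possible solution. As you probably know, a lambda function is just an anonymous function declaration. If you only have a few lambdas, each used once, and naming them would not clutter your code, you can give the lambda a name and pass that name (without parentheses), like this: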

import cPickle as pickle

def addition(x, y):
    return x+y


if __name__ == "__main__":
    s = pickle.dumps(addition)
    f = pickle.loads(s)
    assert f(3,4) == 7

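Naming the function also makes the code easier to read, and you do not need an extra dependency such as dill. Only do this, though, when the benefit outweighs the added complexity of the extra function(s).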
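The snippets in this thread target Python 2. As a minimal Python 3 sketch of the same idea (the helper `try_pickle` is a hypothetical name introduced here for illustration, not a library function), the standard `pickle` module stores a module-level function by its qualified name, while a lambda has no name that can be looked up again:

```python
import pickle


def addition(x, y):
    """Module-level function: pickle records only where to import it from."""
    return x + y


def try_pickle(obj):
    """Return the pickled bytes, or None if the standard pickle module refuses obj."""
    try:
        return pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return None


if __name__ == "__main__":
    # Expected when run as a script: the named function pickles, the lambda does not.
    print("named function pickled:", try_pickle(addition) is not None)
    print("lambda pickled:", try_pickle(lambda x, y: x + y) is not None)
```

Because only the name is stored, the process that unpickles the data must be able to import the same function under the same module path.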

34

No, Python cannot pickle lambda functions:

>>> import cPickle as pickle
>>> s = pickle.dumps(lambda x,y: x+y)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/copy_reg.py", line 70, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle function objects

Not sure what you did that made it succeed...

49

Python can pickle lambda functions. We will cover Python 2 and 3 separately, because pickle is implemented differently in the two.

  • Python 3.6

In Python 3 there is no module named cPickle. We have pickle instead, but by default it does not support pickling lambda functions. Let's look at its dispatch table:

>> import pickle
>> pickle.Pickler.dispatch_table
<member 'dispatch_table' of '_pickle.Pickler' objects>

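Wait. I tried to look up the dispatch_table of pickle, not of _pickle. _pickle is the alternative, faster C implementation of pickle. But we never imported it! If it is available, this C implementation is imported automatically at the end of the pure-Python pickle module.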

# Use the faster _pickle if possible
try:
    from _pickle import (
        PickleError,
        PicklingError,
        UnpicklingError,
        Pickler,
        Unpickler,
        dump,
        dumps,
        load,
        loads
    )
except ImportError:
    Pickler, Unpickler = _Pickler, _Unpickler
    dump, dumps, load, loads = _dump, _dumps, _load, _loads

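We are still left with the question of how to pickle lambda functions in Python 3. The answer is that you cannot do it with the native pickle or _pickle. You need to import dill or cloudpickle and use that instead of the native pickle module.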

>> import dill
>> dill.loads(dill.dumps(lambda x:x))
<function __main__.<lambda>>
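To check the claim that neither native implementation handles lambdas in Python 3, here is a small sketch. It assumes CPython, where `pickle._dumps` and `pickle._Pickler` are the private pure-Python fallbacks named in the source excerpt above. The helper names `rejects_lambda` and `using_c_accelerator` are made up for this example.

```python
import pickle


def rejects_lambda(dumps):
    """Return True if the given dumps() callable refuses to pickle a lambda."""
    try:
        dumps(lambda x: x)
    except (pickle.PicklingError, AttributeError, TypeError):
        return True
    return False


def using_c_accelerator():
    """True when pickle.Pickler is the C class from _pickle, not the pure-Python one."""
    return pickle.Pickler is not pickle._Pickler


if __name__ == "__main__":
    print("C accelerator in use:", using_c_accelerator())
    print("pickle.dumps rejects lambda:", rejects_lambda(pickle.dumps))
    print("pickle._dumps rejects lambda:", rejects_lambda(pickle._dumps))
```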

  • Python 2.7

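pickle uses the pickle registry, which is simply a mapping from a type to the function used to serialize (pickle) objects of that type. You can view the pickle registry like this: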

>> pickle.Pickler.dispatch

{bool: <function pickle.save_bool>,
 instance: <function pickle.save_inst>,
 classobj: <function pickle.save_global>,
 float: <function pickle.save_float>,
 function: <function pickle.save_global>,
 int: <function pickle.save_int>,
 list: <function pickle.save_list>,
 long: <function pickle.save_long>,
 dict: <function pickle.save_dict>,
 builtin_function_or_method: <function pickle.save_global>,
 NoneType: <function pickle.save_none>,
 str: <function pickle.save_string>,
 tuple: <function pickle.save_tuple>,
 type: <function pickle.save_global>,
 unicode: <function pickle.save_unicode>}
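For readers on Python 3, a comparable table still exists on the private pure-Python class `pickle._Pickler` (an implementation detail of CPython, so treat this as a sketch rather than a stable API). The helper `handler_name` is a hypothetical name used only here:

```python
import copyreg
import pickle
import types


def handler_name(tp):
    """Name of the pure-Python save_* handler registered for a type, or None."""
    handler = pickle._Pickler.dispatch.get(tp)
    return handler.__name__ if handler is not None else None


if __name__ == "__main__":
    for tp in (types.FunctionType, list, dict, complex):
        print(tp.__name__, "->", handler_name(tp))
    # Types without a dedicated handler fall back to copyreg or __reduce_ex__.
    print("complex handled via copyreg:", complex in copyreg.dispatch_table)
```

Functions map to `save_global`, which is why they are pickled by name rather than by value.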

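To pickle custom types, Python provides the copy_reg module for registering our own functions. The module's documentation covers this in more detail. By default, the copy_reg module supports pickling the following additional types: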

>> import copy_reg
>> copy_reg.dispatch_table

{code: <function ipykernel.codeutil.reduce_code>,
 complex: <function copy_reg.pickle_complex>,
 _sre.SRE_Pattern: <function re._pickle>,
 posix.statvfs_result: <function os._pickle_statvfs_result>,
 posix.stat_result: <function os._pickle_stat_result>}
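In Python 3 the module is named `copyreg`, and a reducer can be attached to a single `Pickler` through its `dispatch_table` attribute instead of the global registry. The sketch below registers a reducer for code objects, loosely in the spirit of the `code` entry above. The names `reduce_code_via_marshal` and `dumps_with_code_support` are hypothetical, and using `marshal` is an assumption of this example, not necessarily what ipykernel does:

```python
import copyreg
import io
import marshal
import pickle
import types


def reduce_code_via_marshal(code):
    """Reducer: describe a code object as marshal.loads(<bytes>)."""
    # marshal output is specific to the Python version that produced it.
    return marshal.loads, (marshal.dumps(code),)


def dumps_with_code_support(obj):
    """Pickle obj with a private dispatch table that also handles code objects."""
    buf = io.BytesIO()
    pickler = pickle.Pickler(buf)
    # Copy the global table so the registration stays local to this pickler.
    pickler.dispatch_table = copyreg.dispatch_table.copy()
    pickler.dispatch_table[types.CodeType] = reduce_code_via_marshal
    pickler.dump(obj)
    return buf.getvalue()


if __name__ == "__main__":
    code = (lambda x, y: x + y).__code__
    restored = pickle.loads(dumps_with_code_support(code))
    add = types.FunctionType(restored, {})
    print(add(3, 4))
```

This only covers the code object. Function objects themselves are routed to `save_global` before the dispatch table is consulted, which is the gap that libraries such as dill fill.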

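Now, the type of a lambda function is types.FunctionType. However, the built-in handler for this type, function: <function pickle.save_global>, cannot pickle lambda functions: it stores a function by name, and a lambda has no name that can be looked up again. Therefore third-party libraries such as dill and cloudpickle override the built-in handler and add extra logic to pickle lambda functions. Let's import dill and see what it does.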

>> import dill
>> pickle.Pickler.dispatch

{_pyio.BufferedReader: <function dill.dill.save_file>,
 _pyio.TextIOWrapper: <function dill.dill.save_file>,
 _pyio.BufferedWriter: <function dill.dill.save_file>,
 _pyio.BufferedRandom: <function dill.dill.save_file>,
 functools.partial: <function dill.dill.save_functor>,
 operator.attrgetter: <function dill.dill.save_attrgetter>,
 operator.itemgetter: <function dill.dill.save_itemgetter>,
 cStringIO.StringI: <function dill.dill.save_stringi>,
 cStringIO.StringO: <function dill.dill.save_stringo>,
 bool: <function pickle.save_bool>,
 cell: <function dill.dill.save_cell>,
 instancemethod: <function dill.dill.save_instancemethod0>,
 instance: <function pickle.save_inst>,
 classobj: <function dill.dill.save_classobj>,
 code: <function dill.dill.save_code>,
 property: <function dill.dill.save_property>,
 method-wrapper: <function dill.dill.save_instancemethod>,
 dictproxy: <function dill.dill.save_dictproxy>,
 wrapper_descriptor: <function dill.dill.save_wrapper_descriptor>,
 getset_descriptor: <function dill.dill.save_wrapper_descriptor>,
 member_descriptor: <function dill.dill.save_wrapper_descriptor>,
 method_descriptor: <function dill.dill.save_wrapper_descriptor>,
 file: <function dill.dill.save_file>,
 float: <function pickle.save_float>,
 staticmethod: <function dill.dill.save_classmethod>,
 classmethod: <function dill.dill.save_classmethod>,
 function: <function dill.dill.save_function>,
 int: <function pickle.save_int>,
 list: <function pickle.save_list>,
 long: <function pickle.save_long>,
 dict: <function dill.dill.save_module_dict>,
 builtin_function_or_method: <function dill.dill.save_builtin_method>,
 module: <function dill.dill.save_module>,
 NotImplementedType: <function dill.dill.save_singleton>,
 NoneType: <function pickle.save_none>,
 xrange: <function dill.dill.save_singleton>,
 slice: <function dill.dill.save_slice>,
 ellipsis: <function dill.dill.save_singleton>,
 str: <function pickle.save_string>,
 tuple: <function pickle.save_tuple>,
 super: <function dill.dill.save_functor>,
 type: <function dill.dill.save_type>,
 weakcallableproxy: <function dill.dill.save_weakproxy>,
 weakproxy: <function dill.dill.save_weakproxy>,
 weakref: <function dill.dill.save_weakref>,
 unicode: <function pickle.save_unicode>,
 thread.lock: <function dill.dill.save_lock>}

Now let's try to pickle a lambda function.

>> pickle.loads(pickle.dumps(lambda x:x))
<function __main__.<lambda>>

It works!!

In Python 2 we have two versions of pickle:

import pickle # pure Python version
pickle.__file__ # <install directory>/python-2.7/lib64/python2.7/pickle.py

import cPickle # C extension
cPickle.__file__ # <install directory>/python-2.7/lib64/python2.7/lib-dynload/cPickle.so

Now let's try to pickle a lambda with cPickle, the C implementation.

>> import cPickle
>> cPickle.loads(cPickle.dumps(lambda x:x))
TypeError: can't pickle function objects

What went wrong? Let's look at cPickle's dispatch table.

>> cPickle.Pickler.dispatch_table
AttributeError: 'builtin_function_or_method' object has no attribute 'dispatch_table'

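pickle and cPickle are implemented differently. Importing dill only makes the pure-Python version of pickle work. The downside of using pickle instead of cPickle is that it can be as much as 1000 times slower than cPickle.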
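The actual slowdown depends on the data and the interpreter, so it is worth measuring. A rough Python 3 sketch, assuming CPython's private pure-Python `pickle._dumps` stands in for Python 2's pickle and `pickle.dumps` (the C accelerator, when available) stands in for cPickle; `SAMPLE` and `time_dumps` are names invented for this example:

```python
import pickle
import timeit

# Arbitrary sample payload; results will differ for other data shapes.
SAMPLE = [{"id": i, "name": str(i), "values": list(range(10))} for i in range(1000)]


def time_dumps(dumps, repeat=5):
    """Best-of-N wall time in seconds for one dumps() call on SAMPLE."""
    return min(timeit.repeat(lambda: dumps(SAMPLE), number=1, repeat=repeat))


if __name__ == "__main__":
    c_time = time_dumps(pickle.dumps)
    py_time = time_dumps(pickle._dumps)
    print("pickle.dumps : %.6f s" % c_time)
    print("pickle._dumps: %.6f s" % py_time)
    print("ratio        : %.1fx" % (py_time / c_time))
```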

Hope this clears up your doubts.

95

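Yes, Python can pickle lambda functions, but only if something uses copy_reg to register how lambda functions should be pickled. The dill package loads the copy_reg entries you need into the pickle registry when you import dill.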

Python 2.7.8 (default, Jul 13 2014, 02:29:54) 
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 
>>> import dill  # the code below will fail without this line
>>> 
>>> import pickle
>>> s = pickle.dumps(lambda x, y: x+y)
>>> f = pickle.loads(s)
>>> assert f(3,4) == 7
>>> f
<function <lambda> at 0x10aebdaa0>

You can get dill here: https://github.com/uqfoundation
