Why does calling multiprocessing.Pool's apply_async raise "'module' object has no attribute XXX"?

3 votes
1 answer
2573 views
Asked 2025-04-18 18:51

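Here is the code. When I copy and paste it into the command prompt (the interactive Python interpreter), I get the error 'module' object has no attribute 'func'. When I save it as a .py file and run python test.py, it works fine.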

import multiprocessing
import time

def func(msg):
    for i in xrange(3):
        print msg
        time.sleep(1)



if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=4)
    for i in xrange(5):
        msg = "hello %d" %(i)
        pool.apply_async(func, (msg, ))
    pool.close()
    pool.join()
    print "Sub-process(es) done."

Can anyone explain the difference between running Python code at the command prompt and running it from a file? Thanks a lot!

1 Answer

6

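This happens because on Windows, func has to be pickled (serialized) and sent to the child process via inter-process communication (IPC). A pickled function is only a reference: the name of its module plus the name of the function. To unpickle func, the child must therefore be able to find it in its own copy of the __main__ module. With a normal Python script, the child can re-import your script, so its __main__ contains every function you defined at the top level of the script, and everything works. In the interactive interpreter, however, there is no file to re-import, so functions you defined at the prompt do not exist in the child's __main__. This is easier to see if you reproduce the problem with multiprocessing.Process directly: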

>>> def f():
...  print "HI"
...
>>> import multiprocessing
>>> p = multiprocessing.Process(target=f)
>>> p.start()
>>> Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\python27\lib\multiprocessing\forking.py", line 381, in main
    self = load(from_parent)
  File "C:\python27\lib\pickle.py", line 1378, in load
    return Unpickler(file).load()
  File "C:\python27\lib\pickle.py", line 858, in load
    dispatch[key](self)
  File "C:\python27\lib\pickle.py", line 1090, in load_global
    klass = self.find_class(module, name)
  File "C:\python27\lib\pickle.py", line 1126, in find_class
    klass = getattr(mod, name)
AttributeError: 'module' object has no attribute 'f'

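This makes it clearer that the failure comes from pickle: it cannot find f in the module it was told to look in. If you add some tracing to pickle.py, you can see that the 'module' in the error message is __main__: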

def load_global(self):
    module = self.readline()[:-1]
    name = self.readline()[:-1]
    print("module {} name {}".format(module, name))  # I added this.
    klass = self.find_class(module, name)
    self.append(klass)

Running the same code again with that extra print statement gives:

module multiprocessing.process name Process
module __main__ name f
< same traceback as before>
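
The two values that load_global reads, a module name and an attribute name, are all that pickle stores for a function. As a rough illustration (written for Python 3, unlike the Python 2 session above; json.dumps and the name no_such_function are stand-ins chosen for this sketch, not part of the original question), the snippet below pickles a function, checks that the payload carries only those names, and then hands pickle a hand-written payload that names an attribute the module does not have:

```python
import json
import pickle

# Pickling a module-level function stores a reference (module name plus
# function name), not the function's code.
payload = pickle.dumps(json.dumps)
print(b"json" in payload, b"dumps" in payload)

# Unpickling resolves that reference by importing the module and looking
# the name up, so we get the very same function object back.
print(pickle.loads(payload) is json.dumps)

# A hand-written protocol 0 pickle: the GLOBAL opcode "c" is followed by
# "module\nname\n", and "." ends the stream. "no_such_function" is a
# made-up name that the json module does not define.
missing = b"cjson\nno_such_function\n."


def lookup_error(data):
    """Return the AttributeError message raised while unpickling, or None."""
    try:
        pickle.loads(data)
    except AttributeError as exc:
        return str(exc)
    return None


print(lookup_error(missing))
```

The last call fails for the same reason the child process does: the module can be imported, but the name recorded in the pickle is not defined in it.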

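It's worth noting that this example actually works on Posix platforms, because there os.fork() is used to create the child process, which means that any function defined before the Pool is created is available in the child's __main__ module. So while the example above would work there, the following one still fails, because the worker function is defined after the Pool is created (that is, after os.fork() has been called):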

>>> import multiprocessing
>>> p = multiprocessing.Pool(2)
>>> def f(a):
...  print(a)
... 
>>> p.apply(f, ("hi",))
Process PoolWorker-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/multiprocessing/process.py", line 231, in _bootstrap
    self.run()
  File "/usr/lib64/python2.6/multiprocessing/process.py", line 88, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib64/python2.6/multiprocessing/pool.py", line 57, in worker
    task = get()
  File "/usr/lib64/python2.6/multiprocessing/queues.py", line 339, in get
    return recv()
AttributeError: 'module' object has no attribute 'f'
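
One way to sidestep the problem in an interactive session, offered here as a sketch and not as part of the original answer, is to keep the worker function in a real module file that the child process can import by name. The example below is written for Python 3 and asks for the "spawn" start method explicitly (the method used on Windows), so it behaves the same way on every platform. The module name mp_workers_demo, the function shout, and the helper make_worker_module are all hypothetical names invented for this illustration. In practice you would simply save the function in a .py file yourself and import it at the prompt.

```python
import importlib
import multiprocessing
import os
import sys
import tempfile

# Source of a tiny worker module. In real use this would be a .py file
# you saved by hand; it is written out here only to keep the sketch
# self-contained.
WORKER_SOURCE = "def shout(msg):\n    return msg.upper()\n"


def make_worker_module(directory, name="mp_workers_demo"):
    """Write the worker module into `directory` and import it by name."""
    path = os.path.join(directory, name + ".py")
    with open(path, "w") as handle:
        handle.write(WORKER_SOURCE)
    if directory not in sys.path:
        sys.path.insert(0, directory)
    importlib.invalidate_caches()
    return importlib.import_module(name)


def main():
    with tempfile.TemporaryDirectory() as tmp:
        workers = make_worker_module(tmp)
        # "spawn" starts a fresh interpreter for each worker, as on Windows.
        ctx = multiprocessing.get_context("spawn")
        with ctx.Pool(processes=2) as pool:
            pending = [
                pool.apply_async(workers.shout, ("hello %d" % i,))
                for i in range(3)
            ]
            print([result.get(timeout=30) for result in pending])


if __name__ == "__main__":
    main()
```

Because shout is pickled as a reference to mp_workers_demo.shout and not to __main__.shout, each worker can resolve it by importing that module, regardless of whether the parent is a script or an interactive session.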
