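Python extensions (Boost.Python and Py++) and dlopen confusion
I am wrapping a C++ project with Py++/Boost.Python for use on both Windows and Linux. On Windows everything works fine, but on Linux I am a bit confused. The C++ project is built as a single shared library called libsimif, but I want to split it into three separate extension modules. For simplicity I will only discuss two of them, since the third behaves the same way. The first module is called storage and defines some data structures. It does not depend on anything in the other two modules. The second module is called control, and it uses the data structures defined in storage.
On the C++ side, the headers and sources for storage and control live in completely separate directories. I have tried several different configurations for building the extensions, but one thing has stayed constant: for storage, I generate Py++ wrappers only for the headers in the storage directory, and I build only the source files in that directory plus the sources Py++ generates. The same goes for the control extension.
The configuration I am currently using passes libsimif as a library to the distutils.Extension constructor. I then have to make sure libsimif can be found on LD_LIBRARY_PATH before starting Python. With that in place I can start Python and import either module (or import from them), and everything works as expected. Here is some sample output from this working configuration: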
>>> import ast.simif.model_io.storage as storage
>>> import ast.simif.model_io.control as control
>>> dir(storage)
['DiscreteStore', 'PulseStore', 'RtStore', 'SerialStore', 'SharedMemoryBuilder', 'SharedMemoryDeleter', 'SpaceWireStore', '__doc__', '__file__', '__name__', '__package__']
>>> dir(control)
['DiscreteController', 'ModelIoController', 'PulseController', 'RtController', 'SerialController', 'SpaceWireController', '__doc__', '__file__', '__name__', '__package__']
>>> storage.__file__
'ast/simif/model_io/storage.so'
>>> control.__file__
'ast/simif/model_io/control.so'
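As you can see, each module has its own shared library and its own distinct set of symbols. Now here is where I get confused. On Linux we always set the dlopen flags to RTLD_NOW and RTLD_GLOBAL. If I do that, this is what happens: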
>>> import sys
>>> import DLFCN
>>> sys.setdlopenflags(DLFCN.RTLD_NOW | DLFCN.RTLD_GLOBAL)
>>> import ast.simif.model_io.storage as storage
>>> import ast.simif.model_io.control as control
__main__:1: RuntimeWarning: to-Python converter for DiscreteStore::FrameData already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for PulseStore::FrameData already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for RtStore::Link already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for RtStore::FrameData already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for RtStore::RtData already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for SerialStore::FrameData already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for SharedMemoryBuilder already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for SharedMemoryDeleter already registered; second conversion method ignored.
>>> dir(storage)
['DiscreteStore', 'PulseStore', 'RtStore', 'SerialStore', 'SharedMemoryBuilder', 'SharedMemoryDeleter', 'SpaceWireStore', '__doc__', '__file__', '__name__', '__package__']
>>> dir(control)
['DiscreteStore', 'PulseStore', 'RtStore', 'SerialStore', 'SharedMemoryBuilder', 'SharedMemoryDeleter', '__doc__', '__file__', '__name__', '__package__']
>>> storage.__file__
'ast/simif/model_io/storage.so'
>>> control.__file__
'ast/simif/model_io/control.so'
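So storage imports fine, but control complains about a lot of duplicate registrations. Then, when I inspect the modules, control is completely wrong. It is as if it tried to import storage a second time, even though __file__ reports the correct shared library. If I switch the import order and import control before storage, this happens: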
>>> import sys
>>> import DLFCN
>>> sys.setdlopenflags(DLFCN.RTLD_NOW | DLFCN.RTLD_GLOBAL)
>>> import ast.simif.model_io.control as control
>>> dir(control)
['DiscreteController', 'ModelIoController', 'PulseController', 'RtController', 'SerialController', 'SpaceWireController', '__doc__', '__file__', '__name__', '__package__']
>>> import ast.simif.model_io.storage as storage
__main__:1: RuntimeWarning: to-Python converter for DiscreteController already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for PulseController already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for RtController already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for SerialController already registered; second conversion method ignored.
__main__:1: RuntimeWarning: to-Python converter for SpaceWireController already registered; second conversion method ignored.
>>> dir(storage)
['DiscreteController', 'ModelIoController', 'PulseController', 'RtController', 'SerialController', 'SpaceWireController', 'SpaceWireStore', '__doc__', '__file__', '__name__', '__package__']
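Similar behavior, but now it is the storage import that is completely messed up. Does anyone understand what is going on here?
I am using:
- x64 Python 2.6.6 on x64 RHEL6, GCC version 4.4.6
- x64 Python 2.6.5 on x64 RHEL5, GCC version 4.1.2
1 Answer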
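It turns out this is caused by the way the Boost.Python registration code is generated when using Py++'s balanced_split_module. balanced_split_module splits all the registration code across a fixed number of source files, each with its own registration function. Those source files are named with the extension name plus a generated file number (of the form _.cpp, with the extension name before the underscore and the file number after it). The problem is that the actual functions inside those files do not include the extension name; they are simply register_1(), register_2(), and so on. That is fine when you import only one module, or when the modules' symbols are not made global. With RTLD_GLOBAL set, however, the first module imports successfully, but every module imported after it ends up calling the registration functions that were loaded with the first module. That is why the second import reports the converters as already registered and then exposes the first module's classes.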
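To make the name clash concrete, here is a toy model in pure Python. It is only a sketch of the "first definition wins" behavior described above, not how the dynamic linker or Boost.Python is actually implemented. The names make_module and ToyLoader are hypothetical, and each "module" is reduced to a symbol table containing a single register_1 function.

```python
def make_module(name):
    """Build a toy 'shared library': a symbol table whose register_1
    reports which module's registration code actually ran.
    (Hypothetical helper, for illustration only.)"""
    def register_1():
        return name
    return {"register_1": register_1}


class ToyLoader:
    """Very rough model of symbol lookup at import time (an assumption,
    not the real dynamic linker)."""

    def __init__(self, global_symbols):
        self.global_symbols = global_symbols
        self.global_table = {}

    def import_module(self, symbols):
        if self.global_symbols:
            # RTLD_GLOBAL-like: symbols go into one shared table and the
            # first definition of a given name wins.
            for sym, fn in symbols.items():
                self.global_table.setdefault(sym, fn)
            lookup = self.global_table
        else:
            # Local-like: each module resolves register_1 to its own copy.
            lookup = symbols
        # The module's init code calls register_1 by its unqualified name.
        return lookup["register_1"]()


storage = make_module("storage")
control = make_module("control")

local = ToyLoader(global_symbols=False)
local_results = [local.import_module(storage), local.import_module(control)]

shared = ToyLoader(global_symbols=True)
global_results = [shared.import_module(storage), shared.import_module(control)]

print(local_results)   # ['storage', 'control']
print(global_results)  # ['storage', 'storage']
```

In the global case the second import runs the first module's register_1, which mirrors control showing storage's classes in the transcripts above. Reversing the import order flips which module wins.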