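I am a non-root user on a cluster computer running Scientific Linux release 6.6 (Carbon).
I am experiencing Theano crashes when running code on a GPU with CUDA 7.5 and cuDNN 5. I am using Python 2.7, Theano 0.9, Keras 1.0.7 and Lasagne 0.1.
The crash below occurs only when I run the program on a GPU node with cuDNN enabled. The code completes without problems on the CPU and on the GPU with cuDNN disabled.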
Traceback (most recent call last):
File "runner.py", line 306, in <module>
main()
File "runner.py", line 241, in main
queries_exp = __import__(args.exp_model).queries_exp
File "/mnt/nfs2/inf/tjb32/workspace/CNN_EL/nlp-entity-convnet/exp_multi_conv_cosim.py", line 923, in <module>
queries_exp = EntityVectorLinkExp()
File "/mnt/nfs2/inf/tjb32/workspace/CNN_EL/nlp-entity-convnet/exp_multi_conv_cosim.py", line 51, in __init__
self._setup()
File "/mnt/nfs2/inf/tjb32/workspace/CNN_EL/nlp-entity-convnet/exp_multi_conv_cosim.py", line 543, in _setup
on_unused_input='ignore',
File "/home/t/tj/tjb32/.local/lib/python2.7/site-packages/theano/compile/function.py", line 326, in function
output_keys=output_keys)
File "/home/t/tj/tjb32/.local/lib/python2.7/site-packages/theano/compile/pfunc.py", line 484, in pfunc
output_keys=output_keys)
File "/home/t/tj/tjb32/.local/lib/python2.7/site-packages/theano/compile/function_module.py", line 1788, in orig_function
output_keys=output_keys).create(
File "/home/t/tj/tjb32/.local/lib/python2.7/site-packages/theano/compile/function_module.py", line 1467, in __init__
optimizer_profile = optimizer(fgraph)
File "/home/t/tj/tjb32/.local/lib/python2.7/site-packages/theano/gof/opt.py", line 102, in __call__
return self.optimize(fgraph)
File "/home/t/tj/tjb32/.local/lib/python2.7/site-packages/theano/gof/opt.py", line 90, in optimize
ret = self.apply(fgraph, *args, **kwargs)
File "/home/t/tj/tjb32/.local/lib/python2.7/site-packages/theano/gof/opt.py", line 235, in apply
sub_prof = optimizer.optimize(fgraph)
File "/home/t/tj/tjb32/.local/lib/python2.7/site-packages/theano/gof/opt.py", line 90, in optimize
ret = self.apply(fgraph, *args, **kwargs)
File "/home/t/tj/tjb32/.local/lib/python2.7/site-packages/theano/gof/opt.py", line 235, in apply
sub_prof = optimizer.optimize(fgraph)
File "/home/t/tj/tjb32/.local/lib/python2.7/site-packages/theano/gof/opt.py", line 90, in optimize
ret = self.apply(fgraph, *args, **kwargs)
File "/home/t/tj/tjb32/.local/lib/python2.7/site-packages/theano/gof/opt.py", line 2262, in apply
lopt_change = self.process_node(fgraph, node, lopt)
File "/home/t/tj/tjb32/.local/lib/python2.7/site-packages/theano/gof/opt.py", line 1825, in process_node
lopt, node)
File "/home/t/tj/tjb32/.local/lib/python2.7/site-packages/theano/gof/opt.py", line 1719, in warn_inplace
return NavigatorOptimizer.warn(exc, nav, repl_pairs, local_opt, node)
File "/home/t/tj/tjb32/.local/lib/python2.7/site-packages/theano/gof/opt.py", line 1705, in warn
raise exc
AssertionError
My .theanorc looks like this:
[.theanorc contents missing from the source]
My profile looks as follows:
export LD_LIBRARY_PATH=/home/t/tj/tjb32/cuda/lib64:$LD_LIBRARY_PATH
export CPATH=/home/t/tj/tjb32/cuda/include:$CPATH
export LIBRARY_PATH=/home/t/tj/tjb32/cuda/lib64:$LD_LIBRARY_PATH
export PATH=/home/t/tj/tjb32/cuda/bin:$PATH
When I query Theano, it returns the following, which indicates that Theano is interacting with CUDA and cuDNN:
Using gpu device 0: Tesla K20m (CNMeM is enabled with initial size: 95.0% of memory, cuDNN 5005)
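I am fairly sure I have installed CUDA and cuDNN correctly. If anyone can suggest any additional configuration steps I may have missed that could be causing the cuDNN crash, it would be greatly appreciated.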
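One way to double-check the environment from the same shell that launches the job is to look for the cuDNN files on the search paths set in the profile above. The following is a minimal sketch and not part of Theano: the helper name `find_on_path_var` is made up for illustration, and it assumes the library is named `libcudnn.so*` and the header `cudnn.h`.

```python
import glob
import os


def find_on_path_var(var_name, pattern, environ=None):
    """Return files matching `pattern` in the directories listed in `var_name`.

    Hypothetical helper for illustration; it is not part of Theano.
    """
    environ = os.environ if environ is None else environ
    matches = []
    for directory in environ.get(var_name, "").split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing ':'
        matches.extend(sorted(glob.glob(os.path.join(directory, pattern))))
    return matches


if __name__ == "__main__":
    # Variable names come from the profile above; file names are assumed.
    checks = [
        ("LD_LIBRARY_PATH", "libcudnn.so*"),
        ("LIBRARY_PATH", "libcudnn.so*"),
        ("CPATH", "cudnn.h"),
    ]
    for var_name, pattern in checks:
        found = find_on_path_var(var_name, pattern)
        print("%s: %s" % (var_name, found if found else "nothing found"))
```

If any of the three variables reports nothing, the corresponding export line is the first thing to look at. A clean result only shows that the files are visible; it does not rule out a version mismatch.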
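I am also running a DNN in Keras with CUDA 7.5 and cuDNN 5. I created a separate directory in my home directory (cuDNN/copy) and placed all the cuDNN files obtained from the NVIDIA website (the .so and .h files) in it. I then made the appropriate changes to the PATH and LD_LIBRARY_PATH variables in my bashrc, and also changed my .theanorc file. With that, the DNN works for me. This is what my bashrc looks like, and this is what my .theanorc looks like: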
[file contents missing from the source]
Not sure if this is the problem, but:
export LIBRARY_PATH=/home/t/tj/tjb32/cuda/lib64:$LD_LIBRARY_PATH
should be?
export LIBRARY_PATH=/home/t/tj/tjb32/cuda/lib64:$LIBRARY_PATH
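To see what the difference between the two lines amounts to, the two variants can be simulated on a plain dictionary. This is an illustrative sketch only: `prepend` is a hypothetical helper, and the starting values `/old/ld` and `/old/lib` are invented.

```python
def prepend(directory, existing):
    """Mimic `export VAR=directory:$EXISTING` in a shell (illustrative helper)."""
    return directory + ":" + existing


cuda_lib = "/home/t/tj/tjb32/cuda/lib64"

# Invented starting values, for illustration only.
env = {"LD_LIBRARY_PATH": "/old/ld", "LIBRARY_PATH": "/old/lib"}

# First line of the profile: LD_LIBRARY_PATH is extended as intended.
env["LD_LIBRARY_PATH"] = prepend(cuda_lib, env["LD_LIBRARY_PATH"])

# As written in the question: LIBRARY_PATH is rebuilt from LD_LIBRARY_PATH,
# so its own previous value (/old/lib) is dropped.
as_written = prepend(cuda_lib, env["LD_LIBRARY_PATH"])

# As suggested in this answer: LIBRARY_PATH keeps its previous value.
as_suggested = prepend(cuda_lib, env["LIBRARY_PATH"])

print(as_written)
print(as_suggested)
```

In both cases the CUDA directory ends up first on LIBRARY_PATH; the variants differ only in what follows it.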