CUDA LSTM "unspecified launch failure" error

Posted 2024-05-15 12:52:02


I have an Nvidia GTX 1050 card with CUDA 10.1 and cuDNN 7.6.5. Whenever I try to run LSTM cells, I get a large number of errors.

Here is my code:

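from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

model = Sequential()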
model.add(LSTM(64, input_shape=(x_train.shape[1], x_train.shape[2]), return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(64, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(64, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(32))
model.add(Dropout(0.2))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(optimizer='adam', loss='mse')


model.fit(x_train, y, epochs=5, batch_size=16)

Here is my TensorFlow version and the full traceback:

In [2]: tf.__version__
Out[2]: '2.3.0'

Traceback:

 Epoch 1/100
    2020-09-04 15:27:30.033120: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
    2020-09-04 15:27:31.436246: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
     27255/261088 [==>...........................] - ETA: 51:45 - loss: 0.0130
    2020-09-04 15:33:38.188521: E tensorflow/stream_executor/dnn.cc:616] CUDNN_STATUS_INTERNAL_ERROR
    in tensorflow/stream_executor/cuda/cuda_dnn.cc(1892): 'cudnnRNNForwardTraining( cudnn.handle(), rnn_desc.handle(), model_dims.max_seq_length, input_desc.handles(), input_data.opaque(), input_h_desc.handle(), input_h_data.opaque(), input_c_desc.handle(), input_c_data.opaque(), rnn_desc.params_handle(), params.opaque(), output_desc.handles(), output_data->opaque(), output_h_desc.handle(), output_h_data->opaque(), output_c_desc.handle(), output_c_data->opaque(), workspace.opaque(), workspace.size(), reserve_space.opaque(), reserve_space.size())'
    2020-09-04 15:33:38.191709: E tensorflow/stream_executor/cuda/cuda_event.cc:29] Error polling for event status: failed to query event: CUDA_ERROR_LAUNCH_FAILED: unspecified launch failure
    2020-09-04 15:33:38.273883: F tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc:220] Unexpected Event status: 1
    2020-09-04 15:33:38.256027: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQU

1 answer
User
#1 · Posted 2024-05-15 12:52:02

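How much data are you sending to the model at once? It looks to me like you need to adjust the batch size: you may be feeding the GPU too much data at a time, which causes CUDA to crash. How long are your sequences, and how much memory is allocated on the GPU? Without more information about the data, and about whether CUDA and cuDNN are installed correctly, it is hard to give a more definite solution.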
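To make the two checks above concrete (is the GPU visible to TensorFlow, and how is its memory being allocated), here is a minimal sketch. It assumes TensorFlow 2.x and uses `tf.config.list_physical_devices` and `tf.config.experimental.set_memory_growth`. The helper name `enable_gpu_memory_growth` is made up for illustration, and the import guard is there only so the snippet also runs on a machine without TensorFlow. Memory growth is a setting that is often tried for cuDNN errors on small cards; it is not a confirmed fix for this particular crash.

```python
# Sketch: check that TensorFlow sees the GPU and let it allocate GPU memory
# on demand instead of reserving almost all of it at start-up.
# Run this before building the model.
try:
    import tensorflow as tf
except ImportError:  # assumption: fall back gracefully if TF is not installed
    tf = None


def enable_gpu_memory_growth():
    """Hypothetical helper: returns the names of the GPUs TensorFlow can see."""
    if tf is None:
        print("TensorFlow is not installed.")
        return []

    print("Built with CUDA:", tf.test.is_built_with_cuda())
    gpus = tf.config.list_physical_devices("GPU")
    if not gpus:
        print("No GPU visible; check the CUDA/cuDNN installation.")
        return []

    for gpu in gpus:
        try:
            # Must be called before the GPU has been initialized.
            tf.config.experimental.set_memory_growth(gpu, True)
        except RuntimeError as err:
            print("Could not set memory growth:", err)
    return [gpu.name for gpu in gpus]


if __name__ == "__main__":
    print("Visible GPUs:", enable_gpu_memory_growth())
```

If the GPU is listed and the error persists, passing a smaller `batch_size` to `model.fit` is the next thing to try, as suggested above.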
