In TensorFlow/Keras, when running the code from https://github.com/pierluigiferrari/ssd_keras with the estimator ssd300_evaluation, I get this error:
Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
This is very similar to the unresolved question: Google Colab Error : Failed to get convolution algorithm.This is probably because cuDNN failed to initialize
My setup for this issue:
Python: 3.6.4
TensorFlow version: 1.12.0
Keras version: 2.2.4
CUDA: V10.0
cuDNN: V7.4.1.5
NVIDIA GeForce GTX 1080
I also ran:
import tensorflow as tf
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)
with tf.Session() as sess:
    print(sess.run(c))
That ran with no errors or issues.
A minimal example is:
from keras import backend as K
from keras.models import load_model
from keras.optimizers import Adam
from scipy.misc import imread
import numpy as np
from matplotlib import pyplot as plt
from models.keras_ssd300 import ssd_300
from keras_loss_function.keras_ssd_loss import SSDLoss
from keras_layers.keras_layer_AnchorBoxes import AnchorBoxes
from keras_layers.keras_layer_DecodeDetections import DecodeDetections
from keras_layers.keras_layer_DecodeDetectionsFast import DecodeDetectionsFast
from keras_layers.keras_layer_L2Normalization import L2Normalization
from data_generator.object_detection_2d_data_generator import DataGenerator
from eval_utils.average_precision_evaluator import Evaluator
import tensorflow as tf
%matplotlib inline
import keras
keras.__version__
# Set a few configuration parameters.
img_height = 300
img_width = 300
n_classes = 20
model_mode = 'inference'
K.clear_session() # Clear previous models from memory.
model = ssd_300(image_size=(img_height, img_width, 3),
                n_classes=n_classes,
                mode=model_mode,
                l2_regularization=0.0005,
                scales=[0.1, 0.2, 0.37, 0.54, 0.71, 0.88, 1.05], # The scales for MS COCO [0.07, 0.15, 0.33, 0.51, 0.69, 0.87, 1.05]
                aspect_ratios_per_layer=[[1.0, 2.0, 0.5],
                                         [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                                         [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                                         [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                                         [1.0, 2.0, 0.5],
                                         [1.0, 2.0, 0.5]],
                two_boxes_for_ar1=True,
                steps=[8, 16, 32, 64, 100, 300],
                offsets=[0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
                clip_boxes=False,
                variances=[0.1, 0.1, 0.2, 0.2],
                normalize_coords=True,
                subtract_mean=[123, 117, 104],
                swap_channels=[2, 1, 0],
                confidence_thresh=0.01,
                iou_threshold=0.45,
                top_k=200,
                nms_max_output_size=400)
# 2: Load the trained weights into the model.
# TODO: Set the path of the trained weights.
weights_path = 'C:/Users/USAgData/TF SSD Keras/weights/VGG_VOC0712Plus_SSD_300x300_iter_240000.h5'
model.load_weights(weights_path, by_name=True)
# 3: Compile the model so that Keras won't complain the next time you load it.
adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
ssd_loss = SSDLoss(neg_pos_ratio=3, alpha=1.0)
model.compile(optimizer=adam, loss=ssd_loss.compute_loss)
dataset = DataGenerator()
# TODO: Set the paths to the dataset here.
dir= "C:/Users/USAgData/TF SSD Keras/VOC/VOCtest_06-Nov-2007/VOCdevkit/VOC2007/"
Pascal_VOC_dataset_images_dir = dir+ 'JPEGImages'
Pascal_VOC_dataset_annotations_dir = dir + 'Annotations/'
Pascal_VOC_dataset_image_set_filename = dir+'ImageSets/Main/test.txt'
# The XML parser needs to know what object class names to look for and in which order to map them to integers.
classes = ['background',
           'aeroplane', 'bicycle', 'bird', 'boat',
           'bottle', 'bus', 'car', 'cat',
           'chair', 'cow', 'diningtable', 'dog',
           'horse', 'motorbike', 'person', 'pottedplant',
           'sheep', 'sofa', 'train', 'tvmonitor']
dataset.parse_xml(images_dirs=[Pascal_VOC_dataset_images_dir],
                  image_set_filenames=[Pascal_VOC_dataset_image_set_filename],
                  annotations_dirs=[Pascal_VOC_dataset_annotations_dir],
                  classes=classes,
                  include_classes='all',
                  exclude_truncated=False,
                  exclude_difficult=False,
                  ret=False)
evaluator = Evaluator(model=model,
                      n_classes=n_classes,
                      data_generator=dataset,
                      model_mode=model_mode)
results = evaluator(img_height=img_height,
                    img_width=img_width,
                    batch_size=8,
                    data_generator_mode='resize',
                    round_confidences=False,
                    matching_iou_threshold=0.5,
                    border_pixels='include',
                    sorting_algorithm='quicksort',
                    average_precision_mode='sample',
                    num_recall_points=11,
                    ignore_neutral_boxes=True,
                    return_precisions=True,
                    return_recalls=True,
                    return_average_precisions=True,
                    verbose=True)
I had this error and fixed it by uninstalling every CUDA and cuDNN version from my system. I then installed CUDA Toolkit 9.0 (without any patches) and cuDNN v7.4.1 for CUDA 9.0.
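I have seen this error message for three different reasons, each with a different solution:
1. You have cache issues
I regularly work around this error by shutting down my Python process, removing the ~/.nv directory (on Linux, rm -rf ~/.nv), and restarting the Python process. I don't know exactly why this works. It is probably at least partly related to the second option:
2. You're out of memory
The error can also appear if you run out of graphics card RAM. With an NVIDIA GPU you can check graphics card memory usage with nvidia-smi. This gives you a readout of how much GPU RAM is in use (something like 6025MiB / 6086MiB if you are almost at the limit) as well as a list of the processes that are using GPU RAM.
If you have run out of RAM, you need to restart the process (which frees the RAM) and then take a less memory-intensive approach. There are a few options:
If not used together with the items above, this can slow down model evaluation, presumably because a large dataset has to be swapped in and out to fit into the small amount of memory that was allocated.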
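One less memory-intensive setup can be sketched as follows. It is an assumption about a reasonable approach, not something stated in the answer: it caps TensorFlow's share of GPU memory through the session config, using the TensorFlow 1.x and Keras 2.2.x calls tf.GPUOptions, tf.ConfigProto, tf.Session and K.set_session. The helper names gpu_options_kwargs and install_limited_session and the 0.9 default are hypothetical.

```python
def gpu_options_kwargs(memory_fraction=0.9, allow_growth=True):
    """Validate the settings and return keyword arguments for tf.GPUOptions."""
    if not 0.0 < memory_fraction <= 1.0:
        raise ValueError("memory_fraction must be in (0, 1], got %r" % (memory_fraction,))
    return {
        # Upper bound on the share of each visible GPU's memory this process may use.
        "per_process_gpu_memory_fraction": float(memory_fraction),
        # Start small and allocate more GPU memory only as needed.
        "allow_growth": bool(allow_growth),
    }


def install_limited_session(memory_fraction=0.9, allow_growth=True):
    """Create a TF 1.x session with limited GPU memory and hand it to Keras."""
    # Imported lazily so the validation helper above works without a GPU stack.
    import tensorflow as tf  # assumes the TensorFlow 1.x API
    from keras import backend as K

    gpu_options = tf.GPUOptions(**gpu_options_kwargs(memory_fraction, allow_growth))
    session = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
    K.set_session(session)
    return session
```

In the script from the question, a call to install_limited_session() would most likely belong after K.clear_session() and before ssd_300(...) is built, since clearing the Keras session should discard any session set earlier. The cap applies only to the current process and does not release memory held by other processes.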
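3. You have incompatible versions of CUDA, TensorFlow, NVIDIA drivers, etc.
If you have never had similar models working, you are not running out of VRAM, and your cache is clean, I would go back and set up CUDA + TensorFlow using the best available installation guide. I have had the most success following the instructions at https://www.tensorflow.org/install/gpu rather than those on the NVIDIA/CUDA site.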
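A quick way to spot such a mismatch is to compare the installed CUDA toolkit with the version the TensorFlow binary was built against. The sketch below is illustrative only. It assumes a small, partial version table taken from TensorFlow's published tested build configurations, and it assumes that nvcc --version prints a "release X.Y" string. All helper names are hypothetical.

```python
import re
import shutil
import subprocess

# CUDA version the official tensorflow-gpu wheels were built against
# (partial list, assumed from TensorFlow's tested build configurations).
EXPECTED_CUDA = {
    "1.8": "9.0",
    "1.12": "9.0",
    "1.13": "10.0",
    "1.14": "10.0",
}


def parse_nvcc_release(nvcc_output):
    """Extract 'X.Y' from the 'release X.Y' part of `nvcc --version` output."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def cuda_matches(tf_version, cuda_version):
    """True/False for TF releases in the table, None for releases not listed."""
    major_minor = ".".join(tf_version.split(".")[:2])
    expected = EXPECTED_CUDA.get(major_minor)
    if expected is None:
        return None
    return cuda_version == expected


def installed_cuda_version():
    """Ask nvcc for the toolkit version; returns None if nvcc is not on PATH."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(["nvcc", "--version"],
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    cuda = installed_cuda_version()
    print("CUDA toolkit:", cuda)
    print("Matches TensorFlow 1.12.0:", cuda_matches("1.12.0", cuda))
```

With the versions from the question (TensorFlow 1.12.0 and CUDA 10.0), cuda_matches("1.12.0", "10.0") returns False, which is consistent with the fix of installing CUDA 9.0 reported above.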
The problem is that newer TensorFlow versions (1.10.x and later) are not compatible with cuDNN 7.0.5 and CUDA 9.0. The easiest fix is to downgrade TensorFlow to 1.8.0:
pip install --upgrade tensorflow-gpu==1.8.0