
Posted 2024-04-25 22:54:24




import pandas as pd
import config
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import LabelBinarizer 
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from AlexNet import AlexNet
from preproce import ImageToArrayPreprocessor
from preproce import AspectAwarePreprocessor
from preproce import FCHeadNet
from preproce import HDF5DatasetGenerator
from preproce import HDF5DatasetWriter
from tensorflow.python.keras.preprocessing.image import ImageDataGenerator
from tensorflow.python.keras.optimizers import RMSprop
from tensorflow.python.keras.optimizers import SGD
from tensorflow.python.keras.applications import VGG16
from tensorflow.python.keras.layers import Input
from tensorflow.python.keras.models import Model
from imutils import paths
import numpy as np
import argparse
import cv2
import os

"""aug = ImageDataGenerator(rotation_range=30, width_shift_range=0.1,
                          height_shift_range=0.1, shear_range=0.2, zoom_range=0.2,
                          horizontal_flip=True, fill_mode="nearest")"""
"""print("[INFO] loading images...")
trainPaths = list(paths.list_images(config.IMAGES_PATH))
dataset = pd.read_csv("train.csv")
labels = dataset.iloc[:, 1].values
le = LabelEncoder()
trainLabels = le.fit_transform(labels)

split = train_test_split(trainPaths, trainLabels,
                          test_size=config.NUM_TEST_IMAGES, stratify=trainLabels,
                          random_state=42)
(trainPaths, testPaths, trainLabels, testLabels) = split

split = train_test_split(trainPaths, trainLabels,
                         test_size=config.NUM_VAL_IMAGES, stratify=trainLabels,random_state=42)
(trainPaths, valPaths, trainLabels, valLabels) = split

datasets = [ ("train", trainPaths, trainLabels, config.TRAIN_HDF5),
             ("val", valPaths, valLabels, config.VAL_HDF5),
             ("test", testPaths, testLabels, config.TEST_HDF5)]

for (dType, paths, labels, outputPath) in datasets: 
    print("[INFO] building {}...".format(outputPath))
    writer = HDF5DatasetWriter((len(paths), 500, 500, 3), outputPath) 
    for (i, (path, label)) in enumerate(zip(paths, labels)): 
        image = cv2.imread(path) 
        image = aap.preprocess(image) 
        writer.add([image], [label])
"""
#aap = AspectAwarePreprocessor(500, 500)
iap = ImageToArrayPreprocessor()
trainGen = HDF5DatasetGenerator(config.TRAIN_HDF5, 8,  preprocessors=[iap], classes=102) 
valGen = HDF5DatasetGenerator(config.VAL_HDF5, 8, preprocessors=[iap], classes=102)

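print("[INFO] compiling model...")
opt = RMSprop(lr=0.001)
# The call that builds `model` is truncated in the post; only its tail ",500,3,102)" survives.
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])
print("[INFO] training head...")
# The leading arguments of this call (the training and validation generators) are truncated in the post.
model.fit_generator(
         steps_per_epoch=trainGen.numImages // 8,
         validation_steps=valGen.numImages // 8,
         max_queue_size=8 * 2, verbose=1)
print("[INFO] serializing model...")
# The save call is truncated in the post; only its tail ", overwrite=True)" survives.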

tensorflow/core/platform/] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2019-08-23 00:19:47.336560: I tensorflow/core/common_runtime/gpu/] Found device 0 with properties:
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.62
pciBusID: 0000:01:00.0
totalMemory: 4.00GiB freeMemory: 3.30GiB
2019-08-23 00:19:47.342432: I tensorflow/core/common_runtime/gpu/] Adding visible gpu devices: 0
2019-08-23 00:19:47.900540: I tensorflow/core/common_runtime/gpu/] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-08-23 00:19:47.904687: I tensorflow/core/common_runtime/gpu/] 0
2019-08-23 00:19:47.907033: I tensorflow/core/common_runtime/gpu/] 0: N
2019-08-23 00:19:47.909380: I tensorflow/core/common_runtime/gpu/] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3007 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-08-23 00:19:48.550001: W tensorflow/core/framework/] Allocation of 822083584 exceeds 10% of system memory.
2019-08-23 00:19:49.089904: W tensorflow/core/framework/] Allocation of 822083584 exceeds 10% of system memory.
2019-08-23 00:19:49.629533: W tensorflow/core/framework/] Allocation of 822083584 exceeds 10% of system memory.
2019-08-23 00:19:50.067994: W tensorflow/core/framework/] Allocation of 822083584 exceeds 10% of system memory.
2019-08-23 00:19:50.523258: W tensorflow/core/framework/] Allocation of 822083584 exceeds 10% of system memory.
Epoch 1/75
2019-08-23 00:20:14.632764: I tensorflow/stream_executor/] successfully opened CUDA library cublas64_100.dll locally
2019-08-23 00:20:16.325917: W tensorflow/core/common_runtime/] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.14GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-08-23 00:20:16.410374: W tensorflow/core/common_runtime/] Allocator (GPU_0_bfc) ran out of memory trying to allocate 836.38MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-08-23 00:20:16.650565: W tensorflow/core/common_runtime/] Allocator (GPU_0_bfc) ran out of memory trying to allocate 429.27MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-08-23 00:20:16.716695: W tensorflow/core/common_runtime/] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.22GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-08-23 00:20:16.733003: W tensorflow/core/common_runtime/] Allocator (GPU_0_bfc) ran out of memory trying to allocate 637.52MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-08-23 00:20:16.782250: W tensorflow/core/common_runtime/] Allocator (GPU_0_bfc) ran out of memory trying to allocate 844.88MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-08-23 00:20:16.792756: W tensorflow/core/common_runtime/] Allocator (GPU_0_bfc) ran out of memory trying to allocate 429.27MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-08-23 00:20:25.135977: W tensorflow/core/common_runtime/] Allocator (GPU_0_bfc) ran out of memory trying to allocate 784.00MiB. Current allocation summary follows.
2019-08-23 00:20:25.143913: I tensorflow/core/common_runtime/] Bin (256): Total Chunks: 104, Chunks in use: 99. 26.0KiB allocated for chunks. 24.8KiB in use in bin. 452B client-requested in use in bin.
2019-08-23 00:20:25.150353: I tensorflow/core/common_runtime/] Bin (512): Total Chunks: 16, Chunks in use: 14. 8.0KiB allocated for chunks. 7.0KiB in use in bin. 5.3KiB client-requested in use in bin.
2019-08-23 00:20:25.160812: I tensorflow/core/common_runtime/] Bin (1024): Total Chunks: 49, Chunks in use: 49. 61.3KiB allocated for chunks. 61.3KiB in use in bin. 60.1KiB client-requested in use in bin.
2019-08-23 00:20:25.169944: I tensorflow/core/common_runtime/] Bin (2048): Total Chunks: 4, Chunks in use: 4. 13.0KiB allocated for chunks. 13.0KiB in use in bin. 12.8KiB client-requested in use in bin.
2019-08-23 00:20:25.182025: I tensorflow/core/common_runtime/] Bin (4096): Total Chunks: 1, Chunks in use: 0. 6.3KiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-23 00:20:25.192454: I tensorflow/core/common_runtime/] Bin (8192): Total Chunks: 1, Chunks in use: 0. 15.0KiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-23 00:20:25.200847: I tensorflow/core/common_runtime/] Bin (16384): Total Chunks: 9, Chunks in use: 9. 144.8KiB allocated for chunks. 144.8KiB in use in bin. 144.0KiB client-requested in use in bin.
2019-08-23 00:20:25.209817: I tensorflow/core/common_runtime/] Bin (32768): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-23 00:20:25.219192: I tensorflow/core/common_runtime/] Bin (65536): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-23 00:20:25.228194: I tensorflow/core/common_runtime/] Bin (131072): Total Chunks: 9, Chunks in use: 9. 1.17MiB allocated for chunks. 1.17MiB in use in bin. 1.16MiB client-requested in use in bin.
2019-08-23 00:20:25.236088: I tensorflow/core/common_runtime/] Bin (262144): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-23 00:20:25.245435: I tensorflow/core/common_runtime/] Bin (524288): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-23 00:20:25.254114: I tensorflow/core/common_runtime/] Bin (1048576): Total Chunks: 8, Chunks in use: 7. 12.25MiB allocated for chunks. 11.22MiB in use in bin. 10.91MiB client-requested in use in bin.
2019-08-23 00:20:25.264209: I tensorflow/core/common_runtime/] Bin (2097152): Total Chunks: 14, Chunks in use: 14. 42.09MiB allocated for chunks. 42.09MiB in use in bin. 42.09MiB client-requested in use in bin.
2019-08-23 00:20:25.273799: I tensorflow/core/common_runtime/] Bin (4194304): Total Chunks: 13, Chunks in use: 13. 80.41MiB allocated for chunks. 80.41MiB in use in bin. 77.91MiB client-requested in use in bin.
2019-08-23 00:20:25.285089: I tensorflow/core/common_runtime/] Bin (8388608): Total Chunks: 13, Chunks in use: 13. 141.14MiB allocated for chunks. 141.14MiB in use in bin. 136.45MiB client-requested in use in bin.
2019-08-23 00:20:25.298520: I tensorflow/core/common_runtime/] Bin (16777216): Total Chunks: 4, Chunks in use: 4. 112.98MiB allocated for chunks. 112.98MiB in use in bin. 112.98MiB client-requested in use in bin.
2019-08-23 00:20:25.306979: I tensorflow/core/common_runtime/] Bin (33554432): Total Chunks: 4, Chunks in use: 4. 183.11MiB allocated for chunks. 183.11MiB in use in bin. 183.11MiB client-requested in use in bin.
2019-08-23 00:20:25.315121: I tensorflow/core/common_runtime/] Bin (67108864): Total Chunks: 1, Chunks in use: 0. 82.18MiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-23 00:20:25.322194: I tensorflow/core/common_runtime/] Bin (134217728): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-23 00:20:25.331550: I tensorflow/core/common_runtime/] Bin (268435456): Total Chunks: 3, Chunks in use: 3. 2.30GiB allocated for chunks. 2.30GiB in use in bin. 2.30GiB client-requested in use in bin.
2019-08-23 00:20:25.342419: I tensorflow/core/common_runtime/] Bin for 784.00MiB was 256.00MiB, Chunk State:
tensorflow/core/common_runtime/] Sum Total of in-use chunks: 2.87GiB
2019-08-23 00:20:50.049508: I tensorflow/core/common_runtime/] Stats: Limit: 3153697177 InUse: 3086482944 MaxInUse: 3153574400 NumAllocs: 388 MaxAllocSize:
2019-08-23 00:20:50.061236: W tensorflow/core/common_runtime/] **************************************************************************************************__
2019-08-23 00:20:50.066546: W tensorflow/core/framework/] OP_REQUIRES failed at : Resource exhausted: OOM when allocating tensor with shape[50176,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
  File "", line 80, in
    max_queue_size=8 * 2, verbose=1)
  File "C:\Users\aleem\Anaconda3\envs\tensorflowf\lib\site-packages\tensorflow\python\keras\engine\", line 1426, in fit_generator
    initial_epoch=initial_epoch)
  File "C:\Users\aleem\Anaconda3\envs\tensorflowf\lib\site-packages\tensorflow\python\keras\engine\", line 191, in model_iteration
    batch_outs = batch_function(*batch_data)
  File "C:\Users\aleem\Anaconda3\envs\tensorflowf\lib\site-packages\tensorflow\python\keras\engine\", line 1191, in train_on_batch
    outputs = self._fit_function(ins)  # pylint: disable=not-callable
  File "C:\Users\aleem\Anaconda3\envs\tensorflowf\lib\site-packages\tensorflow\python\keras\", line 3076, in __call__
    run_metadata=self.run_metadata)
  File "C:\Users\aleem\Anaconda3\envs\tensorflowf\lib\site-packages\tensorflow\python\client\", line 1439, in __call__
    run_metadata_ptr)
  File "C:\Users\aleem\Anaconda3\envs\tensorflowf\lib\site-packages\tensorflow\python\framework\", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[50176,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
     [[{{node training/RMSprop/gradients/loss/kernel/Regularizer_5/Square_grad/Mul_1}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

     [[{{node ConstantFoldingCtrl/loss/activation_6_loss/broadcast_weights/assert_broadcastable/AssertGuard/Switch_0}}]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
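
The sizes in the log are consistent with each other: a float tensor of shape [50176, 4096] at 4 bytes per element comes to exactly the 822083584 bytes (784.00MiB) that the allocator fails to obtain. A quick check of that arithmetic is sketched below; `tensor_bytes` is a hypothetical helper written for this illustration, not part of TensorFlow, and it assumes "type float" means 32-bit floats.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Return the memory needed for a dense tensor of the given shape.

    bytes_per_element=4 assumes 32-bit floats ("type float" in the log).
    """
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


size = tensor_bytes((50176, 4096))
print(size)          # 822083584
print(size / 2**20)  # 784.0
```

The log reports 2.87GiB already in use against a limit of 3153697177 bytes (about 2.94GiB), so one more tensor of this size does not fit.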

Answer #1 · Posted 2024-04-25 22:54:24


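The solution is to check which process is currently using your GPU. If you have an NVIDIA GPU, you can list the processes using it with nvidia-smi, or you can try ps -fA | grep python, which lists the running Python processes (for example, an earlier training run that is still holding GPU memory). Take the process ID from the PID column and terminate that process with kill -9 PID. Then rerun the training; this time your GPU will be free. I ran into the same problem, and clearing the GPU helped me.

  • Note: all of these commands are run in a terminal.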
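
The check described above can also be scripted. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH and supports the `--query-compute-apps` query; the function names `parse_compute_apps` and `list_gpu_processes` are hypothetical and exist only for this illustration. The memory column is kept as text because the driver may not report a number for it on every system.

```python
import csv
import io
import subprocess


def parse_compute_apps(csv_text):
    """Parse CSV rows of the form "pid, process_name, used_memory"."""
    rows = []
    for fields in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        pid, name, used_mib = (field.strip() for field in fields)
        # used_mib stays a string: it may be a value such as "[N/A]".
        rows.append({"pid": int(pid), "name": name, "used_mib": used_mib})
    return rows


def list_gpu_processes():
    """Ask nvidia-smi which compute processes currently hold GPU memory."""
    result = subprocess.run(
        ["nvidia-smi",
         "--query-compute-apps=pid,process_name,used_memory",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_compute_apps(result.stdout)
```

Calling `list_gpu_processes()` returns one dictionary per process; the `pid` values are what you would pass to `kill -9 PID` as described above.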
