ImageDataGenerator() for a CNN with images as both input and output

Posted 2024-04-25 16:44:37


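I want to train a mapping like the following:

Grayscale image -> color image

But, for obvious reasons, the dataset cannot be loaded into RAM all at once as X and Y arrays.

I looked into ImageDataGenerator(), but the documentation didn't give me a clear answer on how to make it work for this case.

Summary: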

Input Shape = (2048, 2048, 1)

Output Shape = (2048, 2048, 2)

Training Dataset = 17,000 images

Validation Dataset = 1,000 images
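
For a rough sense of scale, here is a back-of-the-envelope estimate of the memory needed to hold every training pair at once. It assumes float32 storage (4 bytes per value), which is an assumption on my part and not something fixed by the setup above.

```python
# Rough memory estimate for holding the whole training set in RAM.
# Assumption: every pixel value is stored as float32 (4 bytes).
HEIGHT, WIDTH = 2048, 2048
INPUT_CHANNELS, OUTPUT_CHANNELS = 1, 2
NUM_TRAIN_IMAGES = 17_000
BYTES_PER_VALUE = 4  # float32 (assumed)

values_per_pair = HEIGHT * WIDTH * (INPUT_CHANNELS + OUTPUT_CHANNELS)
bytes_per_pair = values_per_pair * BYTES_PER_VALUE
total_gib = bytes_per_pair * NUM_TRAIN_IMAGES / 2**30

print(f"{bytes_per_pair / 2**20:.0f} MiB per input/output pair")
print(f"{total_gib:.1f} GiB for the full training set")
```

Under that assumption each pair takes 48 MiB and the full training set comes to roughly 797 GiB, so batches have to be streamed from disk.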

Here is the structure of the model I'm trying to train:

Model: "functional_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 2048, 2048,  0                                            
__________________________________________________________________________________________________
conv2d (Conv2D)                 (None, 2048, 2048, 1 160         input_1[0][0]                    
__________________________________________________________________________________________________
leaky_re_lu (LeakyReLU)         (None, 2048, 2048, 1 0           conv2d[0][0]                     
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 2048, 2048, 3 4640        leaky_re_lu[0][0]                
__________________________________________________________________________________________________
leaky_re_lu_1 (LeakyReLU)       (None, 2048, 2048, 3 0           conv2d_1[0][0]                   
__________________________________________________________________________________________________
batch_normalization (BatchNorma (None, 2048, 2048, 3 128         leaky_re_lu_1[0][0]              
__________________________________________________________________________________________________
max_pooling2d (MaxPooling2D)    (None, 1024, 1024, 3 0           batch_normalization[0][0]        
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 1024, 1024, 6 18496       max_pooling2d[0][0]              
__________________________________________________________________________________________________
leaky_re_lu_2 (LeakyReLU)       (None, 1024, 1024, 6 0           conv2d_2[0][0]                   
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 1024, 1024, 6 256         leaky_re_lu_2[0][0]              
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)  (None, 512, 512, 64) 0           batch_normalization_1[0][0]      
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 512, 512, 128 73856       max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
leaky_re_lu_3 (LeakyReLU)       (None, 512, 512, 128 0           conv2d_3[0][0]                   
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 512, 512, 128 512         leaky_re_lu_3[0][0]              
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 512, 512, 256 295168      batch_normalization_2[0][0]      
__________________________________________________________________________________________________
leaky_re_lu_4 (LeakyReLU)       (None, 512, 512, 256 0           conv2d_4[0][0]                   
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 512, 512, 256 1024        leaky_re_lu_4[0][0]              
__________________________________________________________________________________________________
up_sampling2d (UpSampling2D)    (None, 1024, 1024, 2 0           batch_normalization_3[0][0]      
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 1024, 1024, 1 295040      up_sampling2d[0][0]              
__________________________________________________________________________________________________
leaky_re_lu_5 (LeakyReLU)       (None, 1024, 1024, 1 0           conv2d_5[0][0]                   
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 1024, 1024, 1 512         leaky_re_lu_5[0][0]              
__________________________________________________________________________________________________
up_sampling2d_1 (UpSampling2D)  (None, 2048, 2048, 1 0           batch_normalization_4[0][0]      
__________________________________________________________________________________________________
conv2d_6 (Conv2D)               (None, 2048, 2048, 6 73792       up_sampling2d_1[0][0]            
__________________________________________________________________________________________________
leaky_re_lu_6 (LeakyReLU)       (None, 2048, 2048, 6 0           conv2d_6[0][0]                   
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 2048, 2048, 6 0           leaky_re_lu_6[0][0]              
                                                                 input_1[0][0]                    
__________________________________________________________________________________________________
conv2d_7 (Conv2D)               (None, 2048, 2048, 6 37504       concatenate[0][0]                
__________________________________________________________________________________________________
leaky_re_lu_7 (LeakyReLU)       (None, 2048, 2048, 6 0           conv2d_7[0][0]                   
__________________________________________________________________________________________________
batch_normalization_5 (BatchNor (None, 2048, 2048, 6 256         leaky_re_lu_7[0][0]              
__________________________________________________________________________________________________
conv2d_8 (Conv2D)               (None, 2048, 2048, 3 18464       batch_normalization_5[0][0]      
__________________________________________________________________________________________________
leaky_re_lu_8 (LeakyReLU)       (None, 2048, 2048, 3 0           conv2d_8[0][0]                   
__________________________________________________________________________________________________
conv2d_9 (Conv2D)               (None, 2048, 2048, 2 578         leaky_re_lu_8[0][0]              
==================================================================================================
Total params: 820,386
Trainable params: 819,042
Non-trainable params: 1,344
__________________________________________________________________________________________________

2 Answers

This would be the simplest custom training loop:

def reconstruct(colored_inputs):
    with tf.GradientTape() as tape:
        grayscale_inputs = tf.image.rgb_to_grayscale(colored_inputs)

        out = autoencoder(grayscale_inputs)
        loss = loss_object(colored_inputs, out)

    gradients = tape.gradient(loss, autoencoder.trainable_variables)
    optimizer.apply_gradients(zip(gradients, autoencoder.trainable_variables))

    reconstruction_loss(loss)

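Here, my data iterator loops over all the color pictures, but each batch is converted to grayscale before it is passed to the model. The model's RGB output is then compared with the original RGB images. You have to use the argument class_mode=None in flow_from_directory so that the iterator yields images only, without labels. I used tf.image.rgb_to_grayscale to convert from RGB to grayscale.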
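
To make the shapes concrete, here is a small NumPy sketch of what the grayscale conversion does to one batch. It uses the common BT.601 luma weights as a stand-in; I am assuming these approximate what tf.image.rgb_to_grayscale computes, and the helper name to_grayscale is made up for this illustration.

```python
import numpy as np

# Stand-in for tf.image.rgb_to_grayscale (assumed BT.601 luma weights).
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114], dtype=np.float32)


def to_grayscale(rgb_batch):
    """Collapse the last (RGB) axis into one channel, keeping that axis."""
    gray = rgb_batch @ LUMA_WEIGHTS   # weighted sum over the channel axis
    return gray[..., np.newaxis]      # restore a channel axis of size 1


# A fake batch of 4 RGB images of size 32x32, like one batch from the iterator.
colored_inputs = np.random.rand(4, 32, 32, 3).astype(np.float32)
grayscale_inputs = to_grayscale(colored_inputs)

print(colored_inputs.shape)    # training target
print(grayscale_inputs.shape)  # model input
```

The color batch serves as the target and its single-channel version as the input, so no second set of files is needed on disk.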

Full example:

import os

import tensorflow as tf

physical_devices = tf.config.list_physical_devices('GPU')
if physical_devices:  # skip on CPU-only machines, where the list is empty
    tf.config.experimental.set_memory_growth(physical_devices[0], True)

os.chdir(r'catsanddogs')

# Rescale to [0, 1] so the pixel values are valid BinaryCrossentropy targets.
generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1. / 255)
iterator = generator.flow_from_directory(
    directory='.',
    target_size=(32, 32),
    batch_size=4,
    class_mode=None)  # yield image batches only, no labels

encoder = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(32, 32, 1)),
    tf.keras.layers.Dense(32),
    tf.keras.layers.Dense(16)
])

decoder = tf.keras.Sequential([
    tf.keras.layers.Dense(32, input_shape=[16]),
    # Sigmoid keeps outputs in [0, 1], matching the rescaled targets.
    tf.keras.layers.Dense(32 * 32 * 3, activation='sigmoid'),
    tf.keras.layers.Reshape([32, 32, 3])
])


autoencoder = tf.keras.Sequential([encoder, decoder])

loss_object = tf.losses.BinaryCrossentropy()

reconstruction_loss = tf.metrics.Mean(name='reconstruction_loss')

optimizer = tf.optimizers.Adam()


def reconstruct(colored_inputs):
    with tf.GradientTape() as tape:
        grayscale_inputs = tf.image.rgb_to_grayscale(colored_inputs)

        out = autoencoder(grayscale_inputs)
        loss = loss_object(colored_inputs, out)

    gradients = tape.gradient(loss, autoencoder.trainable_variables)
    optimizer.apply_gradients(zip(gradients, autoencoder.trainable_variables))

    reconstruction_loss(loss)


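if __name__ == '__main__':
    template = 'Epoch {:2} Reconstruction Loss {:.4f}'
    for epoch in range(50):
        reconstruction_loss.reset_states()
        # The directory iterator never stops on its own, so draw exactly
        # one epoch's worth of batches instead of looping over it directly.
        for _ in range(len(iterator)):
            reconstruct(next(iterator))
        print(template.format(epoch + 1, reconstruction_loss.result()))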
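
For the setup in the question, where the target is a separate 2-channel image and not the RGB original, one option is a plain Python generator that loads each input/target pair only when its batch is needed. Below is a sketch of that idea; paired_batches, load_pair, the fake loader, and the file names are all hypothetical, and you would replace the loader with real file reading.

```python
import numpy as np


def paired_batches(paths, load_pair, batch_size, shuffle=True, seed=0):
    """Yield (inputs, targets) batches for one epoch, loading lazily.

    paths:     list of sample identifiers (e.g. file paths)
    load_pair: hypothetical callable, path -> (input_array, target_array)
    """
    order = np.arange(len(paths))
    if shuffle:
        np.random.default_rng(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        # Only the samples of the current batch are loaded into memory.
        pairs = [load_pair(paths[i]) for i in order[start:start + batch_size]]
        inputs = np.stack([pair[0] for pair in pairs])
        targets = np.stack([pair[1] for pair in pairs])
        yield inputs, targets


def fake_load_pair(path):
    """Placeholder loader: returns random data instead of reading `path`."""
    rng = np.random.default_rng(len(path))
    return (rng.random((64, 64, 1), dtype=np.float32),
            rng.random((64, 64, 2), dtype=np.float32))


paths = [f"img_{i:05d}.png" for i in range(10)]  # made-up file names
shapes = [(x.shape, y.shape)
          for x, y in paired_batches(paths, fake_load_pair, batch_size=4)]
print(shapes)
```

Each call to paired_batches covers one epoch, so the training loop can create a fresh generator per epoch (with a different seed) and pass every batch to a train step that takes both inputs and targets.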

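I'll post my comment as an answer, with sample code:

You need to train the model in batches. For example, if you want to use 500 images per epoch, you can process 50 images per batch and run 10 steps per epoch. That way only 50 images have to be in memory at a time. You have to set the batch size and set shuffle to True so that the batches contain different images from epoch to epoch.

If your images are in a directory, the comment above translates into the following: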

from keras_preprocessing.image import ImageDataGenerator

# Placeholder values (assumptions); adjust them to your own data.
train_path = "path/to/train"
target_size = (2048, 2048)

preprocessing_images = ImageDataGenerator()

train_generator = preprocessing_images.flow_from_directory(
    train_path,
    target_size=target_size,
    batch_size=50,
    class_mode="categorical",  # classes are provided in categorical format for a 2-unit output layer
    shuffle=True,
    color_mode="grayscale",
    seed=1234567890
)

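Here you have a generator that yields batches of 50 images. I did not set the parameters for your specific problem; you will need to change the target size and the other arguments. Note that this is an infinite generator: it keeps returning batches of 50 images indefinitely. You can wrap it in a tf.data.Dataset or specify the number of steps per epoch. There are several ways to do it. I hope this helps. If you still have questions, I'll elaborate on the answer (I'm busy right now).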
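
To illustrate the steps-per-epoch idea: since the generator never stops, you bound each epoch yourself. Here is a minimal sketch in plain Python, where endless_batches is a made-up stand-in for the real generator and the sample count is the training set size from the question.

```python
import itertools
import math


def endless_batches(num_samples, batch_size):
    """Stand-in for an infinite batch generator: yields index batches forever."""
    while True:
        for start in range(0, num_samples, batch_size):
            yield list(range(start, min(start + batch_size, num_samples)))


num_samples = 17_000  # training set size from the question
batch_size = 50
steps_per_epoch = math.ceil(num_samples / batch_size)

generator = endless_batches(num_samples, batch_size)
for epoch in range(2):
    # islice stops after steps_per_epoch batches; the generator itself never ends.
    one_epoch = itertools.islice(generator, steps_per_epoch)
    seen = sum(len(batch) for batch in one_epoch)
    print(f"epoch {epoch + 1}: {steps_per_epoch} steps, {seen} images")
```

With Keras, the same number is what you would pass as steps_per_epoch to model.fit when training from an endless generator.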
