ImageDataGenerator() for a CNN whose input and output are both images

Model: "functional_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            [(None, 2048, 2048,  0
conv2d (Conv2D)                 (None, 2048, 2048, 1 160         input_1[0][0]
leaky_re_lu (LeakyReLU)         (None, 2048, 2048, 1 0           conv2d[0][0]
conv2d_1 (Conv2D)               (None, 2048, 2048, 3 4640        leaky_re_lu[0][0]
leaky_re_lu_1 (LeakyReLU)       (None, 2048, 2048, 3 0           conv2d_1[0][0]
batch_normalization (BatchNorma (None, 2048, 2048, 3 128         leaky_re_lu_1[0][0]
max_pooling2d (MaxPooling2D)    (None, 1024, 1024, 3 0           batch_normalization[0][0]
conv2d_2 (Conv2D)               (None, 1024, 1024, 6 18496       max_pooling2d[0][0]
leaky_re_lu_2 (LeakyReLU)       (None, 1024, 1024, 6 0           conv2d_2[0][0]
batch_normalization_1 (BatchNor (None, 1024, 1024, 6 256         leaky_re_lu_2[0][0]
max_pooling2d_1 (MaxPooling2D)  (None, 512, 512, 64) 0           batch_normalization_1[0][0]
conv2d_3 (Conv2D)               (None, 512, 512, 128 73856       max_pooling2d_1[0][0]
leaky_re_lu_3 (LeakyReLU)       (None, 512, 512, 128 0           conv2d_3[0][0]
batch_normalization_2 (BatchNor (None, 512, 512, 128 512         leaky_re_lu_3[0][0]
conv2d_4 (Conv2D)               (None, 512, 512, 256 295168      batch_normalization_2[0][0]
leaky_re_lu_4 (LeakyReLU)       (None, 512, 512, 256 0           conv2d_4[0][0]
batch_normalization_3 (BatchNor (None, 512, 512, 256 1024        leaky_re_lu_4[0][0]
up_sampling2d (UpSampling2D)    (None, 1024, 1024, 2 0           batch_normalization_3[0][0]
conv2d_5 (Conv2D)               (None, 1024, 1024, 1 295040      up_sampling2d[0][0]
leaky_re_lu_5 (LeakyReLU)       (None, 1024, 1024, 1 0           conv2d_5[0][0]
batch_normalization_4 (BatchNor (None, 1024, 1024, 1 512         leaky_re_lu_5[0][0]
up_sampling2d_1 (UpSampling2D)  (None, 2048, 2048, 1 0           batch_normalization_4[0][0]
conv2d_6 (Conv2D)               (None, 2048, 2048, 6 73792       up_sampling2d_1[0][0]
leaky_re_lu_6 (LeakyReLU)       (None, 2048, 2048, 6 0           conv2d_6[0][0]
concatenate (Concatenate)       (None, 2048, 2048, 6 0           leaky_re_lu_6[0][0]
                                                                 input_1[0][0]
conv2d_7 (Conv2D)               (None, 2048, 2048, 6 37504       concatenate[0][0]
leaky_re_lu_7 (LeakyReLU)       (None, 2048, 2048, 6 0           conv2d_7[0][0]
batch_normalization_5 (BatchNor (None, 2048, 2048, 6 256         leaky_re_lu_7[0][0]
conv2d_8 (Conv2D)               (None, 2048, 2048, 3 18464       batch_normalization_5[0][0]
leaky_re_lu_8 (LeakyReLU)       (None, 2048, 2048, 3 0           conv2d_8[0][0]
conv2d_9 (Conv2D)               (None, 2048, 2048, 2 578         leaky_re_lu_8[0][0]
==================================================================================================
Total params: 820,386
Trainable params: 819,042
Non-trainable params: 1,344
__________________________________________________________________________________________________

2 answers

User

#1 · edited 2024-04-25 16:44:37

This is about the simplest custom training loop you can write:

def reconstruct(colored_inputs):
    with tf.GradientTape() as tape:
        grayscale_inputs = tf.image.rgb_to_grayscale(colored_inputs)

        out = autoencoder(grayscale_inputs)
        loss = loss_object(colored_inputs, out)

    gradients = tape.gradient(loss, autoencoder.trainable_variables)
    optimizer.apply_gradients(zip(gradients, autoencoder.trainable_variables))

    reconstruction_loss(loss)

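Here, my data iterator loops over all the color images, but each batch is converted to grayscale before it is passed to the model. The model's RGB output is then compared with the original RGB images. You must pass class_mode=None to flow_from_directory so that the iterator yields only images, without labels. I use tf.image.rgb_to_grayscale to convert the RGB batch to grayscale.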
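To make the shapes concrete, here is a small NumPy sketch of what the grayscale step does to one batch. It is an illustration only, not TensorFlow's implementation: the helper name `to_grayscale` is hypothetical and the luma weights are assumed values. The point it shows is that the channel axis shrinks from 3 to 1 for the model input, while the loss target keeps all 3 channels.

```python
import numpy as np

# Assumed luma weights for the R, G and B channels (illustrative values).
LUMA_WEIGHTS = np.array([0.2989, 0.5870, 0.1140], dtype=np.float32)


def to_grayscale(rgb_batch):
    """Collapse the channel axis of a (batch, height, width, 3) array to 1."""
    gray = np.tensordot(rgb_batch, LUMA_WEIGHTS, axes=([-1], [0]))
    return gray[..., np.newaxis]  # keep a channel axis: (batch, h, w, 1)


# A fake batch shaped like the iterator output in this answer: 4 RGB images, 32x32.
rng = np.random.default_rng(0)
colored_inputs = rng.random((4, 32, 32, 3), dtype=np.float32)
grayscale_inputs = to_grayscale(colored_inputs)

print(colored_inputs.shape)    # shape of the target the loss compares against
print(grayscale_inputs.shape)  # shape of what the model actually receives
```

In the answer's code the same role is played by tf.image.rgb_to_grayscale inside the GradientTape block.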

Full example:

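import os
import tensorflow as tf

# Memory growth is only set if a GPU exists; physical_devices[0] fails on CPU-only machines.
physical_devices = tf.config.list_physical_devices('GPU')
if physical_devices:
    tf.config.experimental.set_memory_growth(physical_devices[0], True)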

os.chdir(r'catsanddogs')

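# Scale pixels to [0, 1]; BinaryCrossentropy expects targets in that range.
generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1. / 255)
iterator = generator.flow_from_directory(
    target_size=(32, 32),
    directory='.',
    batch_size=4,
    class_mode=None)  # yield images only, no labels

# Encoder: grayscale image (32, 32, 1) -> 16-dimensional code
encoder = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(32, 32, 1)),
    tf.keras.layers.Dense(32),
    tf.keras.layers.Dense(16)
])

# Decoder: 16-dimensional code -> RGB image (32, 32, 3)
decoder = tf.keras.Sequential([
    tf.keras.layers.Dense(32, input_shape=[16]),
    # Sigmoid keeps predictions in [0, 1], matching the rescaled targets.
    tf.keras.layers.Dense(32 * 32 * 3, activation='sigmoid'),
    tf.keras.layers.Reshape([32, 32, 3])
])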


autoencoder = tf.keras.Sequential([encoder, decoder])

loss_object = tf.losses.BinaryCrossentropy()

reconstruction_loss = tf.metrics.Mean(name='reconstruction_loss')

optimizer = tf.optimizers.Adam()


def reconstruct(colored_inputs):
    with tf.GradientTape() as tape:
        grayscale_inputs = tf.image.rgb_to_grayscale(colored_inputs)

        out = autoencoder(grayscale_inputs)
        loss = loss_object(colored_inputs, out)

    gradients = tape.gradient(loss, autoencoder.trainable_variables)
    optimizer.apply_gradients(zip(gradients, autoencoder.trainable_variables))

    reconstruction_loss(loss)


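if __name__ == '__main__':
    template = 'Epoch {:2} Reconstruction Loss {:.4f}'
    for epoch in range(50):
        # Named reset_state() in newer Keras releases.
        reconstruction_loss.reset_states()
        # The iterator loops forever, so draw exactly one pass of batches per epoch.
        for _ in range(len(iterator)):
            reconstruct(next(iterator))
        print(template.format(epoch + 1, float(reconstruction_loss.result())))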

User

#2 · edited 2024-04-25 16:44:37

I'll post my comment as an answer, with sample code:

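You need to train the model in batches. For example, if you want to go through 500 images in one epoch, you can process 50 images per batch, so the epoch takes 10 steps instead of one. That way only 50 images have to be loaded into memory at a time. Set the batch size, and set shuffle to True so that the batches contain different images.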
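As a rough back-of-the-envelope check of why batching matters, the sketch below estimates the memory taken by the input images alone and the number of steps per epoch. The 500 and 50 figures come from the example above; the image size (2048 x 2048, one float32 channel) is an assumption loosely based on the input shape in the question's model summary, and activations inside the network are not counted.

```python
import math

num_images = 500          # images seen in one epoch (from the example above)
batch_size = 50           # images held in memory at once
height = width = 2048     # assumed image size
channels = 1              # assumed: grayscale input
bytes_per_value = 4       # assumed: float32 pixels

steps_per_epoch = math.ceil(num_images / batch_size)
bytes_per_image = height * width * channels * bytes_per_value
batch_mib = batch_size * bytes_per_image / 2**20
all_mib = num_images * bytes_per_image / 2**20

print(f"steps per epoch: {steps_per_epoch}")
print(f"one batch:  {batch_mib:.0f} MiB")
print(f"all images: {all_mib:.0f} MiB")
```

Under these assumptions a batch of 50 needs a tenth of the memory that loading all 500 images would.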

If your images are in a directory, that advice translates to the following:

from keras_preprocessing.image import ImageDataGenerator

preprocessing_images = ImageDataGenerator()

# train_path and target_size are not defined here; set them for your own data.
train_generator = preprocessing_images.flow_from_directory(
    train_path,
    target_size=target_size,
    batch_size=50,
    class_mode="categorical",  # classes are provided in categorical format for a 2-unit output layer
    shuffle=True,
    color_mode="grayscale",
    seed=1234567890,
)

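This gives you a generator that yields batches of 50 images. I have not set the parameters for your specific problem; you will need to change the target size and so on. Note that this is an infinite generator: it keeps returning batches of 50 images indefinitely. You can wrap it in a tf.data.Dataset or specify the number of steps per epoch. There are several ways to do it. I hope this helps. If you still have questions, I will expand the answer (I'm busy right now).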
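Because the generator never stops on its own, the training loop has to decide where an epoch ends. The following pure-Python sketch shows the idea with a stand-in generator. `endless_batches` and `run_epoch` are hypothetical helpers, not part of Keras; a real loop would draw from `train_generator` and run a training step on each batch.

```python
import itertools
import math


def endless_batches(samples, batch_size):
    """Stand-in for an infinite Keras-style generator: cycles over the data forever."""
    while True:
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]


def run_epoch(batch_source, steps_per_epoch):
    """Draw exactly steps_per_epoch batches, then stop; returns the samples seen."""
    seen = 0
    for batch in itertools.islice(batch_source, steps_per_epoch):
        seen += len(batch)  # a real loop would run one training step here
    return seen


samples = list(range(500))   # pretend these are 500 images
batch_size = 50
steps_per_epoch = math.ceil(len(samples) / batch_size)

batches = endless_batches(samples, batch_size)
for epoch in range(3):
    print(f"epoch {epoch + 1}: saw {run_epoch(batches, steps_per_epoch)} samples")
```

With Keras itself, the same bound is what the steps_per_epoch argument of model.fit expresses.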
