Transfer learning for a CNN in Keras does not work with image input, but works with vector input

Posted 2024-04-19 12:24:00


I want to do transfer learning in Keras. I set up a ResNet50 network, marked as non-trainable, with a few extra layers on top:

from keras.applications import ResNet50
from keras.models import Sequential
from keras.layers import Dense, Dropout

# Image input
model = Sequential()
model.add(ResNet50(include_top=False, pooling='avg')) # output is 2048
model.add(Dropout(0.05))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.15))
model.add(Dense(512, activation='relu'))
model.add(Dense(7, activation='softmax'))
model.layers[0].trainable = False  # freeze the ResNet50 base
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()

I then create the input data x_batch using ResNet50's preprocess_input function, together with the one-hot encoded labels y_batch, and fit the model as follows:

^{pr2}$

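Training accuracy approaches 100% after about 10 epochs, but validation accuracy actually drops from around 50% to 30%, and validation loss increases steadily.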

However, if I build a network from only the last layers:

# Vector input
model2 = Sequential()
model2.add(Dropout(0.05, input_shape=(2048,)))
model2.add(Dense(512, activation='relu'))
model2.add(Dropout(0.15))
model2.add(Dense(512, activation='relu'))
model2.add(Dense(7, activation='softmax'))
model2.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model2.summary()

and feed it the output of a ResNet50 prediction:

resnet = ResNet50(include_top=False, pooling='avg')
x_batch = resnet.predict(x_batch)

validation accuracy reaches about 85%. What is going on? Why does the image-input approach not work?

Update:

This problem is really strange. If I change ResNet50 to VGG19, it seems to work.


2 answers
User
Answer 1 · Posted 2024-04-19 12:24:00

You can try:

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten

IMG_SIZE = 224  # assumed input size; set this to match your images

Res = keras.applications.resnet.ResNet50(include_top=False,
              weights='imagenet', input_shape=(IMG_SIZE, IMG_SIZE, 3))

# Freeze all layers of the pre-trained base
for layer in Res.layers:
    layer.trainable = False

# Check the trainable status of the individual layers
for layer in Res.layers:
    print(layer, layer.trainable)

# Image input
model2 = Sequential()
model2.add(Res)
model2.add(Flatten())
model2.add(Dropout(0.05))
model2.add(Dense(512, activation='relu'))
model2.add(Dropout(0.15))
model2.add(Dense(512, activation='relu'))
model2.add(Dense(7, activation='softmax'))
model2.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model2.summary()
User
Answer 2 · Posted 2024-04-19 12:24:00

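After a lot of googling, I found that the problem lies in ResNet's batch normalization (BN) layers. VGGNet has no batch normalization layers, which is why it works for that topology.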
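One way to check the claim about VGG versus ResNet is to count the layer types in each model. The helper below is a minimal sketch, not part of the original answer: `count_layer_types` is a hypothetical name, and it assumes only that a model exposes a `.layers` list and that a nested model (such as a ResNet50 base added to a `Sequential`) exposes `.layers` as well.

```python
from collections import Counter

def count_layer_types(model):
    """Count layers by class name, descending into nested models."""
    counts = Counter()
    for layer in model.layers:
        # Assumption: only nested models expose a .layers attribute
        if hasattr(layer, "layers"):
            counts.update(count_layer_types(layer))
        else:
            counts[type(layer).__name__] += 1
    return counts
```

Calling `count_layer_types(model)["BatchNormalization"]` on the ResNet50-based model and on a VGG19-based one should show whether BN layers are present in each.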

There is a pull request for Keras that fixes this problem, and it explains the issue in detail:

Assume we use one of the pre-trained CNNs of Keras and we want to fine-tune it. Unfortunately, we get no guarantees that the mean and variance of our new dataset inside the BN layers will be similar to the ones of the original dataset. As a result, if we fine-tune the top layers, their weights will be adjusted to the mean/variance of the new dataset. Nevertheless, during inference the top layers will receive data which are scaled using the mean/variance of the original dataset. This discrepancy can lead to reduced accuracy.

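This means that the BN layers adjust to the training data, but during validation the original parameters of the BN layers are used. As far as I can tell, the fix is to allow the frozen BN layers to use the updated mean and variance from training.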
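The mismatch can be illustrated with plain NumPy. This is a minimal sketch under stated assumptions, not Keras's actual implementation: the stored statistics and the "new dataset" activations are made-up values, and the learned scale and shift of a real BN layer are omitted.

```python
import numpy as np

def batch_norm(x, mean, var, eps=1e-3):
    """Normalize x with the given per-feature mean and variance (no scale/shift)."""
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)

# Hypothetical statistics stored in a pre-trained BN layer (original dataset)
stored_mean = np.zeros(4)
stored_var = np.ones(4)

# Hypothetical activations from a new dataset with a shifted, wider distribution
new_batch = rng.normal(loc=3.0, scale=2.0, size=(256, 4))

# Training mode: normalize with the statistics of the current mini-batch
train_out = batch_norm(new_batch, new_batch.mean(axis=0), new_batch.var(axis=0))

# Inference mode: normalize with the stored statistics
infer_out = batch_norm(new_batch, stored_mean, stored_var)

print("training-mode mean/std: ", train_out.mean(), train_out.std())
print("inference-mode mean/std:", infer_out.mean(), infer_out.std())
```

The top layers are fitted to inputs that look like `train_out` (roughly zero mean, unit variance) but at validation time receive inputs that look like `infer_out`, which is the discrepancy the pull request describes.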

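The workaround is to precompute the ResNet outputs. In fact, this considerably reduces training time, because that part of the computation is not repeated.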
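A minimal sketch of that workaround follows. The helper name `cached_features` and the cache file are hypothetical; the only assumption about `extractor` is that it has a Keras-style `predict(x, batch_size=...)` method, as the `resnet` model in the question does.

```python
import os
import numpy as np

def cached_features(extractor, images, cache_path, batch_size=32):
    """Return extractor outputs for images, computing them once and caching to disk."""
    if os.path.exists(cache_path):
        # Reuse the features computed by an earlier run
        return np.load(cache_path)
    features = extractor.predict(images, batch_size=batch_size)
    # Write through a file handle so NumPy does not alter the file name
    with open(cache_path, "wb") as f:
        np.save(f, features)
    return features
```

With the question's setup, `extractor` would be `ResNet50(include_top=False, pooling='avg')` and `images` the output of `preprocess_input`; the returned array then serves as the input to the vector-input `model2`. The cache has to be deleted by hand if the images or the extractor change, since this sketch does not detect stale files.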
