Keras autoencoder: validation loss > training loss, but good performance on the test data


In short:

I trained an autoencoder whose validation loss is always higher than its training loss (see the attached plot). Nevertheless, the autoencoder performs well on the test dataset. I would like to know:

1) with reference to the network architecture provided below, whether anyone can offer insight on how to reduce the validation loss (and on how the validation loss can be so much higher than the training loss, despite the autoencoder's good performance on the test dataset);

2) whether it is actually a problem that there is this gap between training and validation loss (when the performance on the test dataset is actually good).

In detail:

I coded my deep autoencoder in Keras (code below). The architecture is 2001 (input layer) - 1000 - 500 - 200 - 50 - 200 - 500 - 1000 - 2001 (output layer). My samples are 1D functions of time, each with 2001 time components. I have 2000 samples, of which 1500 are used for training and 500 for testing. Of the 1500 training samples, 20% (i.e. 300) are used as the validation set. I normalize the training set by removing the mean and dividing by the standard deviation, and I use the training set's mean and standard deviation to normalize the test set.

I train the autoencoder with the Adamax optimizer and mean squared error as the loss function.

from tensorflow.keras.layers import Input, Dense, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras import optimizers

import numpy as np
import matplotlib.pyplot as plt  # needed for the loss plots below


# data
data = # read my input samples. They are 1d functions of time and I have 2000 of them.
# Each function has 2001 time components

# shuffle data before training
# (np.random.shuffle is used here: Python's random.shuffle can duplicate
# rows when applied to a 2D NumPy array)
np.random.seed(4)
np.random.shuffle(data)

# split training (1500 samples) and testing (500 samples) dataset
X_train = data[:1500]
X_test = data[1500:]

# normalize training and testing set using mean and std deviation of training set
X_mean = X_train.mean()
X_train -= X_mean
X_std = X_train.std()
X_train /= X_std

X_test -= X_mean
X_test /= X_std


### MODEL ###

# Architecture

# input layer
input_shape = [X_train.shape[1]]
X_input = Input(input_shape)

# hidden layers

x = Dense(1000, activation='tanh', name='enc0')(X_input)
encoded = Dense(500, activation='tanh', name='enc1')(x)
encoded_2 = Dense(200, activation='tanh', name='enc2')(encoded)
encoded_3 = Dense(50, activation='tanh', name='enc3')(encoded_2)
decoded_2 = Dense(200, activation='tanh', name='dec2')(encoded_3)
decoded_1 = Dense(500, activation='tanh', name='dec1')(decoded_2)
x2 = Dense(1000, activation='tanh', name='dec0')(decoded_1)

# output layer (no activation, i.e. linear, since this is a regression)
decoded = Dense(input_shape[0], name='out')(x2)

# the Model
model = Model(inputs=X_input, outputs=decoded, name='autoencoder')

# optimizer
opt = optimizers.Adamax()
# note: 'acc' is not a meaningful metric for an MSE regression objective
model.compile(optimizer=opt, loss='mse', metrics=['acc'])
model.summary()

###################

### TRAINING ###

epochs = 1000
# train the model
history = model.fit(x = X_train, y = X_train,
                    epochs=epochs,
                    batch_size=100,
                    validation_split=0.2)  # using 20% of training samples for validation

# Testing
prediction = model.predict(X_test)
# undo the normalization (vectorized, instead of an element-wise loop)
prediction = prediction * X_std + X_mean

loss = history.history['loss']
val_loss = history.history['val_loss']
epoch_range = range(epochs)  # avoid shadowing the integer `epochs`
plt.figure()
plt.plot(epoch_range, loss, 'bo', label='Training loss')
plt.plot(epoch_range, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()
plt.close()

1 Answer

2) if it is actually a problem that there is this gap between training and validation loss (when the performance on the testing dataset is actually good).

This is just the generalization gap, i.e. the expected gap in performance between the training set and the validation set; quoting a recent blog post by Google AI:

An important concept for understanding generalization is the generalization gap, i.e., the difference between a model’s performance on training data and its performance on unseen data drawn from the same distribution.
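Concretely, this gap can be read straight off the history object returned by model.fit in the question's code; a minimal sketch (assuming training has already run and numpy is imported as above):

# per-epoch generalization gap: validation loss minus training loss
gap = np.array(history.history['val_loss']) - np.array(history.history['loss'])
print(f"gap at last epoch: {gap[-1]:.4g}")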


I would think that this is a signal of overfitting. However, my Autoencoder performs well on the testing dataset.

It is not, but the reason is not quite what you may think (not to mention that "well" is a highly subjective term).

The signature of overfitting is when your validation loss starts increasing while your training loss keeps decreasing, i.e.:

[Figure: training loss keeps decreasing while validation loss turns upward and the gap between the curves widens - the classic overfitting pattern]

Your graph does not show this behavior; notice also the gap (pun intended) between the curves in the plot above (adapted from the Wikipedia entry on overfitting).
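If you did want a safety net against that pattern, here is a minimal sketch of how training could be stopped automatically should the validation loss ever start climbing (reusing the model and X_train from the question; the patience value of 20 epochs is illustrative, not prescribed):

from tensorflow.keras.callbacks import EarlyStopping

# stop once val_loss has not improved for 20 consecutive epochs,
# and roll back to the weights from the best epoch seen
early_stop = EarlyStopping(monitor='val_loss', patience=20,
                           restore_best_weights=True)

history = model.fit(X_train, X_train,
                    epochs=1000,
                    batch_size=100,
                    validation_split=0.2,
                    callbacks=[early_stop])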

how it is possible that the validation loss is much higher than the training one, despite the performance of the Autoencoder being good on the testing dataset

There is absolutely no contradiction here; notice that your training loss is almost zero, which is not necessarily surprising in itself, but it would certainly be surprising if the validation loss were anywhere near zero. And, again, "well" is a highly subjective term.
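As a quick sanity check, one can put the three numbers side by side on the same normalized scale (a sketch reusing the variable names from the question's code; the test MSE is recomputed on the normalized X_test, so it is directly comparable to the other two):

# final training / validation MSE from the history, plus test MSE
train_mse = history.history['loss'][-1]
val_mse = history.history['val_loss'][-1]
test_mse = np.mean((model.predict(X_test) - X_test) ** 2)
print(f"train MSE: {train_mse:.4g} | val MSE: {val_mse:.4g} | test MSE: {test_mse:.4g}")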

In other words, nothing in the information you have provided suggests that there is something wrong with your model...
