I am trying to implement a semi-supervised learning approach in Keras. The idea is to combine a variational autoencoder (VAE) with a classifier that operates on the latent space. As long as the dataset contains no unlabeled data, the model works well and both parts are trained. When unlabeled data is added, only the VAE should be trained, while the classifier should remain unchanged.
One option to solve this is a second input layer that tells the model which data points are labeled and which are not (labeled = 1, unlabeled = 0). Combined with a Lambda layer, I can use this information to dynamically force the output/prediction to a pseudo target value (classifier prediction = pseudo target), which keeps the classifier weights from being trained on those points. This approach works, but it is unsatisfying and produces misleading learning curves (a small toy sketch of that masking step follows the question).
Is there a way to define which data points are used to train both the classifier and the VAE (labeled), and which are used to train only the VAE (unlabeled)?
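To make the masking step concrete, here is a tiny standalone check of what the Lambda layer in my code does. It is only an illustration: the values and the variable names latent and flag are made up, and it assumes the TensorFlow backend of Keras.

# Toy check of the masking idea (made-up values, for illustration only)
from keras import backend as K

latent = K.constant([[0.5, -1.2],
                     [2.0,  0.3]])               # two latent vectors
flag = K.constant([[1], [0]], dtype='int32')     # 1 = labeled, 0 = unlabeled

# Rows belonging to unlabeled points are replaced by zeros, so the classifier
# only receives a constant pseudo input for them
masked = K.switch(K.equal(flag, 0), latent * 0, latent)
print(K.eval(masked))   # [[0.5, -1.2], [0.0, 0.0]]

My actual model code follows below.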
# Imports assumed from the usage below (standalone Keras; use the
# tensorflow.keras equivalents if that is what the rest of the project imports)
import keras
from keras.layers import Input, Dense, Lambda
from keras.models import Model
from keras.losses import mse
from keras import backend as K
'''
=================
Encoder
=================
'''
# Definition
e_i = Input(shape=input_shape, name='encoder_input')
x = Dense(n_neurons, activation="relu")(e_i)
x = Dense(n_neurons, activation="relu")(x)
mu = Dense(latent_dim, name='latent_mu')(x)
sigma = Dense(latent_dim, name='latent_sigma')(x)
# Define sampling with reparameterization trick
def sample_z(args):
    mu, sigma = args
    batch = K.shape(mu)[0]
    dim = K.int_shape(mu)[1]
    eps = K.random_normal(shape=(batch, dim))
    # sigma is the log-variance, so exp(sigma / 2) is the standard deviation
    return mu + K.exp(sigma / 2) * eps
# Use reparameterization trick
z = Lambda(sample_z, output_shape=(latent_dim, ), name='z')([mu, sigma])
# Instantiate encoder
encoder = Model(e_i, [mu, sigma, z], name='encoder')
encoder.summary()
'''
=================
Decoder
=================
'''
# Definition
d_i = Input(shape=(latent_dim, ), name='decoder_input')
x = Dense(n_neurons, activation="relu")(d_i)
x = Dense(n_neurons, activation="relu")(x)
o = Dense(input_dim, activation="sigmoid", name='decoder_output')(x)
# Instantiate decoder
decoder = Model(d_i, o, name='decoder')
decoder.summary()
'''
=================
Classifier
=================
'''
# Definition
c_i = Input(shape=(latent_dim, ), name="classifier_input")
idx_sup = Input(batch_shape=(None, 1), name="idx_supervised", dtype='int32')
def index_supervised(args):
    c, idx_sup = args
    # Zero out the latent vector of every unlabeled point (idx_sup == 0) so the
    # classifier only sees a constant pseudo input for those points
    o_l = K.switch(K.equal(idx_sup, 0), c * 0, c)
    return o_l
x = Lambda(index_supervised)([c_i, idx_sup])
x = Dense(n_neurons_c, activation="relu")(x)
x = Dense(n_neurons_c, activation="relu")(x)
c = Dense(3, activation="softmax")(x)
# Instantiate classifier
classifier = Model([c_i, idx_sup], c, name='classifier')
classifier.summary()
'''
=================
VAE + classifier as a whole
=================
'''
# Instantiate VAE + classifier: the decoder reconstructs from z, the classifier works on mu
enc_mu, enc_sigma, enc_z = encoder(e_i)
vae_outputs = decoder(enc_z)
classifier_output = classifier([enc_mu, idx_sup])
vae_classifier_semi = Model([e_i, idx_sup], [vae_outputs, classifier_output], name='vae_classifier_semi')
vae_classifier_semi.summary()
# Define loss (mu and sigma are captured from the encoder definition above)
def kl_reconstruction_loss(true, pred):
    # Reconstruction loss
    reconstruction_loss = mse(K.flatten(true), K.flatten(pred)) * input_dim
    # KL divergence loss (sigma is the log-variance)
    kl_loss = 1 + sigma - K.square(mu) - K.exp(sigma)
    kl_loss = K.sum(kl_loss, axis=-1)
    kl_loss *= -0.5
    # Total loss = reconstruction loss + KL divergence loss (equally weighted)
    return K.mean(reconstruction_loss + kl_loss)
# Compile VAE
opt = keras.optimizers.Adam(learning_rate=optimizer_learning_rate)
vae_classifier_semi.compile(
    optimizer=opt,
    loss=[kl_reconstruction_loss, 'sparse_categorical_crossentropy'],
    metrics={'classifier': 'accuracy'},
    loss_weights=loss_weights,
)
The index_supervised(args) function is the unsatisfying part of this solution. Whenever idx_sup == 0, it is used to force the output to 0 (with the target also set to 0). In my model, idx_sup is fed 1 for labeled data points and 0 for pseudo-labeled (unlabeled) data points.
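For completeness, the training call then looks roughly like this. It is only a sketch of my setup: x_train, y_class, n_labeled and n_unlabeled are placeholders, the labeled samples are assumed to come first in x_train, and unlabeled points get class 0 as a pseudo target because sparse_categorical_crossentropy still expects a value for every sample.

# Sketch of the training call (array names are placeholders, not actual code)
import numpy as np

# Supervision flag: 1 for the n_labeled labeled samples, 0 for the n_unlabeled rest
idx_sup_train = np.concatenate([np.ones((n_labeled, 1), dtype='int32'),
                                np.zeros((n_unlabeled, 1), dtype='int32')])
# Class targets: true class indices for labeled samples, pseudo class 0 for the unlabeled ones
y_class_full = np.concatenate([y_class, np.zeros(n_unlabeled, dtype='int32')])

vae_classifier_semi.fit(
    [x_train, idx_sup_train],    # inputs: data + supervision flag
    [x_train, y_class_full],     # targets: reconstruction target + (pseudo) class target
    epochs=n_epochs,
    batch_size=batch_size,
)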