Semi-supervised learning with Keras

Published 2024-04-25 15:05:44


I am trying to implement a semi-supervised learning approach with Keras. The idea is to combine a variational autoencoder (VAE) with a classifier that operates on the latent space. As long as the dataset contains only labeled data, the model works well and both parts of the model are trained. When unlabeled data is added, only the VAE should be trained while the classifier should remain unchanged.

One option to address this is a second input layer that carries the information about which data points are labeled and which are unlabeled (labeled = 1, unlabeled = 0). Combined with a Lambda layer, this lets me dynamically set the output/prediction to the pseudo-target value (classifier prediction = pseudo-target), which stops the classifier weights from being trained. This approach works, but it is unsatisfying and produces misleading learning curves.

Is there a solution to define which data points are used to train both the classifier and the VAE (labeled), and which are used to train only the VAE (unlabeled)?

'''
=================
Encoder
=================
'''

# Imports (standalone Keras with a TensorFlow backend assumed)
import keras
from keras import backend as K
from keras.layers import Input, Dense, Lambda
from keras.models import Model
from keras.losses import mse

# Definition
e_i     = Input(shape=input_shape, name='encoder_input')
x       = Dense(n_neurons, activation="relu")(e_i)
x       = Dense(n_neurons, activation="relu")(x)
mu      = Dense(latent_dim, name='latent_mu')(x)
sigma   = Dense(latent_dim, name='latent_sigma')(x)

# Define sampling with reparameterization trick
def sample_z(args):
  mu, sigma = args
  batch     = K.shape(mu)[0]
  dim       = K.int_shape(mu)[1]
  eps       = K.random_normal(shape=(batch, dim))
  return mu + K.exp(sigma / 2) * eps

# Use reparameterization trick
z       = Lambda(sample_z, output_shape=(latent_dim, ), name='z')([mu, sigma])

# Instantiate encoder
encoder = Model(e_i, [mu, sigma, z], name='encoder')
encoder.summary()
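As a standalone check of the sampling layer (a NumPy sketch, not part of the original model; `sample_z_np` and the shapes are illustrative assumptions), the reparameterization trick draws z = mu + exp(sigma/2) * eps with eps ~ N(0, I), where `sigma` is the log-variance:

```python
import numpy as np

def sample_z_np(mu, log_var, seed=0):
    # eps ~ N(0, I); z = mu + std * eps, with std = exp(log_var / 2)
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(log_var / 2) * eps

mu = np.zeros((4, 2))          # batch of 4, latent_dim = 2
log_var = np.zeros((4, 2))     # log-variance 0 -> unit variance
z = sample_z_np(mu, log_var)
print(z.shape)                 # (4, 2)
```

With a very negative log-variance the standard deviation collapses toward zero and z reduces to mu, which is a quick way to confirm the mean/variance roles.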

'''
=================
Decoder
=================
'''

# Definition
d_i   = Input(shape=(latent_dim, ), name='decoder_input')
x     = Dense(n_neurons, activation="relu")(d_i)
x     = Dense(n_neurons, activation="relu")(x)
o     = Dense(input_dim, activation="sigmoid", name='decoder_output')(x)

# Instantiate decoder
decoder = Model(d_i, o, name='decoder')
decoder.summary()

'''
=================
Classifier
=================
'''

# Definition
c_i   = Input(shape=(latent_dim, ), name="classifier_input")

idx_sup  = Input(batch_shape=(None, 1), name="idx_supervised", dtype='int32')
        
def index_supervised(args):
    c, idx_sup = args
    # Use K.equal for an elementwise tensor condition; a bare
    # `idx_sup == 0` may compare Python objects and always be False
    o_l = K.switch(K.equal(idx_sup, 0), c * 0, c)
    return o_l

x = Lambda(index_supervised)([c_i, idx_sup]) 
        
x     = Dense(n_neurons_c, activation="relu")(x)
x     = Dense(n_neurons_c, activation="relu")(x)
c     = Dense(3, activation="softmax")(x)

# Instantiate classifier
classifier = Model([c_i, idx_sup], c, name='classifier')
classifier.summary()

'''
=================
VAE + classifier as a whole
=================
'''

# Instantiate VAE
enc_mu, enc_sigma, enc_z = encoder(e_i)
vae_outputs       = decoder(enc_z)
classifier_output = classifier([enc_mu, idx_sup])
vae_classifier_semi = Model([e_i, idx_sup], [vae_outputs, classifier_output], name='vae_classifier_semi')
vae_classifier_semi.summary()

# Define loss
def kl_reconstruction_loss(true, pred):
  # Reconstruction loss
  reconstruction_loss = mse(K.flatten(true), K.flatten(pred)) * input_dim
  # KL divergence loss
  kl_loss = 1 + sigma - K.square(mu) - K.exp(sigma)
  kl_loss = K.sum(kl_loss, axis=-1)
  kl_loss *= -0.5
  # Total loss = reconstruction loss + KL divergence loss
  return K.mean(reconstruction_loss + kl_loss)
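As a sanity check on the KL term (a NumPy sketch, not part of the original model; `kl_to_standard_normal` is a name introduced here): for a diagonal Gaussian with mean `mu` and log-variance `sigma`, the KL divergence from N(0, I) is -0.5 * sum(1 + sigma - mu^2 - exp(sigma)), which is zero exactly when mu = 0 and sigma = 0:

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    # KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims
    return -0.5 * np.sum(1 + log_var - np.square(mu) - np.exp(log_var), axis=-1)

print(kl_to_standard_normal(np.zeros(3), np.zeros(3)))  # zero for a standard normal
print(kl_to_standard_normal(np.ones(3), np.zeros(3)))   # 1.5
```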


# Compile VAE
opt = keras.optimizers.Adam(learning_rate=optimizer_learning_rate)
vae_classifier_semi.compile(
    optimizer=opt,
    loss=[kl_reconstruction_loss, 'sparse_categorical_crossentropy'],
    metrics={'classifier': 'accuracy'},
    loss_weights=loss_weights)

The `index_supervised` Lambda above is the unsatisfying solution. Whenever `idx_sup == 0`, it sets the output to 0 (the target is also 0). In my model, `idx_sup` carries 1 for labeled data points and 0 for pseudo-labeled (unlabeled) data points.
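A cleaner direction worth considering (a sketch under assumptions not in the original post: that the combined model's outputs are named after the `decoder` and `classifier` submodels, and that per-sample weights are supported for each output) is Keras's `sample_weight` argument to `fit`, which can zero out the classifier loss for unlabeled rows without touching the graph or inventing pseudo-targets:

```python
import numpy as np

# Hypothetical labeled/unlabeled mask for a batch of 4 samples:
# 1.0 for labeled rows, 0.0 for unlabeled (pseudo-labeled) rows.
is_labeled = np.array([1.0, 0.0, 1.0, 0.0])

# One weight array per output: the reconstruction loss trains on every
# sample, while the classifier loss is zeroed for unlabeled samples.
sample_weights = {
    'decoder':    np.ones_like(is_labeled),
    'classifier': is_labeled,
}

# vae_classifier_semi.fit(x_train,
#                         {'decoder': x_train, 'classifier': y_pseudo},
#                         sample_weight=sample_weights, ...)
print(sample_weights['classifier'])
```

With this, the learning curves reflect only the samples that actually contribute to each loss, instead of the artificial zero-loss entries produced by the pseudo-target trick.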
