Custom loss with an external parameter in Keras Tuner

Posted 2024-06-16 09:57:20


While my code runs without any problem with Keras Tuner and standard loss functions like 'mse', I am trying to figure out how to write a custom loss function that accepts an external argument in addition to y_true and y_pred, so that it can be used inside Keras Tuner for LSTM model selection. I am looking for the easiest and least painful way to do this, but I have not found a working solution in previous posts.
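
For reference, the snippets below assume imports along these lines (the keras_tuner package name is my assumption; older releases ship it as kerastuner):

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import backend as K
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import InputLayer, LSTM, Dense
from keras_tuner import Hyperband, Objective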

One approach I followed is the following. Suppose I have these variables:

# external vector needed in custom loss function
ex_loss = np.logical_not(klines_backtest.loc[i_sel, ['d']].to_numpy(dtype=np.float32)[:sample_start])
# create data sequences for x and the target vector y to forecast
x_train, y_train = lstm_data_sequence(dataset[:sample_start, :-1], dataset[:sample_start, -1], lstm_sequence)
# concatenate the external vector to y so that y has shape Nx2
y_train = np.vstack((y_train, ex_loss[lstm_sequence:, 0])).T
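
(lstm_data_sequence is not defined in the post; for readers who want to reproduce this, a plausible sliding-window implementation, whose exact semantics are my assumption, could look like this:)

def lstm_data_sequence(features, target, seq_len):
    # hypothetical helper: build overlapping windows of length seq_len
    # from `features` and pair each window with the matching target value
    x, y = [], []
    for i in range(seq_len, len(features)):
        x.append(features[i - seq_len:i])
        y.append(target[i])
    return np.array(x, dtype=np.float32), np.array(y, dtype=np.float32)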

Then I defined the following loss function:

def bande_loss(y_true, y_pred):
    # column 0 of y_true holds the actual target, column 1 the external flag
    mse = K.square(y_pred - y_true[:,0])
    # intended mask: flag == 1 AND prediction >= target
    i_loss = K.equal(y_true[:,1], 1) and K.greater_equal(y_pred, y_true[:,0])
    i_loss = K.cast(~i_loss, 'float32')
    return K.mean(mse*i_loss)

Basically, instead of wrapping the loss function in a closure to pass the extra variable (which has the same size as y_true), I pack it into y_train, so that during training y_true and the corresponding slice of the external variable are always batched together with the correct batch size.
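
A quick NumPy sanity check of this packing idea (with made-up shapes) shows why the alignment survives batching: Keras slices rows of y, so both columns of a batch always refer to the same samples.

# made-up example: 10 samples, external flag packed as the second column
y = np.arange(10, dtype=np.float32)
flags = (np.arange(10) % 2).astype(np.float32)
y_packed = np.vstack((y, flags)).T        # shape (10, 2)

batch = y_packed[:4]                      # what the loss sees for batch size 4
assert np.array_equal(batch[:, 0], y[:4])
assert np.array_equal(batch[:, 1], flags[:4])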

The LSTM model used for model selection is:

def lstm_model(hp):
    model = Sequential()
    model.add(InputLayer(input_shape=(48*3, 13)))
    num_layers = hp.Int('num_layers', min_value=4, max_value=8, step=2)
    num_units = hp.Choice('units', values=[50, 100, 250, 500])
    n_dropout = hp.Choice('n_dropout', values=[float(0), 0.10, 0.20])
    n_rec_dropout = hp.Choice('n_rec_dropout', values=[float(0), 0.10, 0.20])
    learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4, 1e-5, 1e-6])
    for i in range(num_layers):
        if i < num_layers - 1:
            r_sequence = True
        else:
            r_sequence = False
        model.add(LSTM(
            units=num_units,
            dropout=n_dropout,
            recurrent_dropout=n_rec_dropout,
            return_sequences=r_sequence))

    model.add(Dense(1))
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
        loss=bande_loss,
        metrics=[bande_loss])
    return model

Executing this code:

tuner = Hyperband(
    hypermodel=lstm_model,
    objective=Objective("bande_loss", direction="min"),
    max_epochs=50,
    hyperband_iterations=2,
    executions_per_trial=1,
    overwrite=True,
    project_name='hyperband_tuner')
stop_early = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=3, verbose=1)
tuner.search(x_train, y_train, epochs=30, validation_split=p_train, callbacks=[stop_early],
    shuffle=False, verbose=1)

I get this error:

 The second input must be a scalar, but it has shape [32]
     [[{{node bande_loss/cond/switch_pred/_2736}}]] [Op:__inference_train_function_45266]

Function call stack:
train_function

Note that 32 is the (default) batch size.
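
The cond/switch_pred node in the traceback suggests (this is my reading, not a confirmed diagnosis) that the Python `and` inside bande_loss is the culprit: `and` cannot combine tensors elementwise, and when the loss gets traced it ends up as a conditional whose predicate must be a scalar, while here it has the batch shape [32]. Below is a minimal sketch of an elementwise rewrite; the 0:1 slicing, which keeps the columns of y_true at shape (batch, 1) like the output of Dense(1) and avoids accidental broadcasting, is my assumption about the intended shapes:

def bande_loss(y_true, y_pred):
    target = y_true[:, 0:1]   # (batch, 1), matches y_pred from Dense(1)
    flag = y_true[:, 1:2]     # (batch, 1), external flag packed into y
    mse = K.square(y_pred - target)
    # elementwise AND instead of the Python `and` operator
    i_loss = tf.logical_and(K.equal(flag, 1.0),
                            K.greater_equal(y_pred, target))
    i_loss = K.cast(tf.logical_not(i_loss), 'float32')
    return K.mean(mse * i_loss)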

Note also that running the same code with

def bande_loss(y_true, y_pred):
    mse = K.square(y_pred - y_true[:,0])
    return K.mean(mse)

seems to work fine, while running

def bande_loss(y_true, y_pred):
    mse = K.square(y_pred - y_true[:,1])
    return K.mean(mse)

gives me the same error, and I cannot understand why.

I also tried the loss-function wrapping (closure) approach, this way:

def lstm_model(hp):
    model = Sequential()
    # ... identical to the previous definition, up to model.add(Dense(1)) ...
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
        loss=bande_loss(ex_loss),
        metrics=[bande_loss(ex_loss)])
    return model

def bande_loss(ex_loss):
    def loss(y_true, y_pred):
        mse = K.square(y_pred - y_true)
        # intended mask: external flag is True AND prediction >= target
        i_loss = K.equal(ex_loss, True) and K.greater_equal(y_pred, y_true)
        i_loss = K.cast(~i_loss, 'float32')
        return K.mean(mse*i_loss)
    return loss

...

# external vector needed in custom loss function
ex_loss = np.logical_not(klines_backtest.loc[i_sel, ['d']].to_numpy(dtype=np.float32)[:sample_start])
# create data sequences for x and the target vector y to forecast
x_train, y_train = lstm_data_sequence(dataset[:sample_start, :-1], dataset[:sample_start, -1], lstm_sequence)
ex_loss = K.variable(ex_loss[lstm_sequence:], dtype=bool)

tuner = Hyperband(
    hypermodel=lstm_model,
    objective=Objective("bande_loss(ex_loss)", direction="min"),
    max_epochs=50,
    hyperband_iterations=2,
    executions_per_trial=1,
    overwrite=True,
    project_name='hyperband_tuner')
stop_early = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=3, verbose=1)
tuner.search(x_train, y_train, epochs=30, validation_split=p_train, callbacks=[stop_early],
    shuffle=False, verbose=1)

but I get this error:

tensorflow.python.framework.errors_impl.InvalidArgumentError:  The second input must be a scalar, but it has shape [4176]
         [[{{node cond/switch_pred/_12}}]] [Op:__inference_train_function_34471]

Function call stack:
train_function
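
Two things seem worth noting about this variant (again my reading, not verified). First, the shape [4176] in the error is the length of the full ex_loss vector rather than of a batch: the closure captures the whole training-set vector, so inside the loss it cannot line up with the per-batch slices of y_true and y_pred, on top of the same Python `and` problem as above. Second, Keras records a wrapped metric under the inner function's name, so the string "bande_loss(ex_loss)" passed to Objective will not match any logged metric; monitoring "val_loss" is the safer choice. A sketch of the wrapping pattern with those two points addressed, assuming a per-batch flags tensor could be supplied (Keras Tuner gives no clean hook for that, which is why the y-packing approach above is simpler):

def make_bande_loss(ex_flags):
    # ex_flags: bool values, one per sample of the current batch;
    # reshape to (batch, 1) so comparisons broadcast against y_pred
    ex_flags = tf.reshape(tf.convert_to_tensor(ex_flags, tf.bool), (-1, 1))
    def bande_loss(y_true, y_pred):
        mask = tf.logical_and(ex_flags, K.greater_equal(y_pred, y_true))
        return K.mean(K.square(y_pred - y_true) *
                      K.cast(tf.logical_not(mask), 'float32'))
    return bande_loss  # logged under the name "bande_loss" when used as a metric

tuner = Hyperband(
    hypermodel=lstm_model,
    objective=Objective("val_loss", direction="min"),
    max_epochs=50,
    project_name='hyperband_tuner')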

Can anyone help me, or point me to an easier and more effective way to implement a custom loss function with an external argument inside Keras Tuner?

