I am trying to understand the loss values that TensorFlow reports in different places. Here is what I have learned so far:

- With verbose=0 or verbose=2, the loss value returned by model.fit will not match model.evaluate on the same data, because each batch is computed with the weights as they were at that point in training.
- The loss values returned by model.fit and model.evaluate are not just the result of the chosen loss function; they include additional contributions, e.g. from regularizers.

But... which contributions? And how can I identify them? For example, this script freezes a complex, pre-trained Bert model:
import os
import numpy as np
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "1"
import tensorflow as tf
from transformers import BertConfig, BertTokenizer, TFBertModel

tf.random.set_seed(42)

model_name = 'bert-base-multilingual-cased'
max_length = 2

bert_config = BertConfig.from_pretrained(model_name)
tokenizer = BertTokenizer.from_pretrained(model_name)
batch_encoding = tokenizer(text=["truth", "lies"],
                           max_length=max_length,
                           padding="max_length",
                           truncation=True,
                           return_attention_mask=False,
                           return_token_type_ids=False,
                           return_tensors="tf")
x_true = batch_encoding.data
y_true = tf.constant([[1], [2]], dtype=tf.float32)

bert_input = tf.keras.Input(name="input_ids",
                            shape=(max_length, ),
                            dtype=tf.int32)
bert_model = TFBertModel(config=bert_config)(bert_input)
model = tf.keras.Model(inputs=[bert_input], outputs=[bert_model.pooler_output])

bert = model.layers[-1]
bert.trainable = False

model.compile(loss=tf.keras.losses.MSE, optimizer=tf.keras.optimizers.SGD())
print("Trainable variables:", len(model.trainable_variables))

initial_weights_bert = bert.get_weights().copy()
initial_y_pred = model.predict(x_true, verbose=0)
y_true = np.zeros_like(initial_y_pred)

def inspect(epoch, logs):
    # Weights of the frozen Bert layer must not have moved.
    for i, w in enumerate(bert.get_weights()):
        assert (w == initial_weights_bert[i]).all()
    loss_logged = logs["loss"]
    loss_evaluated = model.evaluate(x_true, y_true, verbose=0)
    y_pred = model.predict(x_true, verbose=0)
    assert (y_pred == initial_y_pred).all()
    loss_computed = tf.math.reduce_mean(model.loss(y_true, y_pred)).numpy()
    print(f"\tloss logged: {loss_logged:.4f}")
    print(f"\tloss evaluated: {loss_evaluated:.4f}")
    print(f"\tloss computed: {loss_computed:.4f}")
    print("\tmodel.losses:", ", ".join(str(loss) for loss in model.losses))

h = model.fit(
    x_true,
    y_true,
    epochs=5,
    shuffle=False,
    callbacks=[tf.keras.callbacks.LambdaCallback(on_epoch_end=inspect)],
    verbose=2)
print(h.history["loss"])
We freeze the Bert layer; I suppose that does not mean much, since Bert is not a regular Keras layer, so there may well be moving parts under the surface. In any case, we assert that all weights stay the same and all predictions stay the same, and yet the loss value varies noticeably:
Trainable variables: 0
Epoch 1/5
1/1 - 7s - loss: 0.1582
loss logged: 0.1582
loss evaluated: 0.1491
loss computed: 0.1491
model.losses:
Epoch 2/5
1/1 - 0s - loss: 0.1546
loss logged: 0.1546
loss evaluated: 0.1491
loss computed: 0.1491
model.losses:
Epoch 3/5
1/1 - 0s - loss: 0.1534
loss logged: 0.1534
loss evaluated: 0.1491
loss computed: 0.1491
model.losses:
Epoch 4/5
1/1 - 0s - loss: 0.1532
loss logged: 0.1532
loss evaluated: 0.1491
loss computed: 0.1491
model.losses:
Epoch 5/5
1/1 - 0s - loss: 0.1557
loss logged: 0.1557
loss evaluated: 0.1491
loss computed: 0.1491
model.losses:
[0.1582154631614685, 0.15455231070518494, 0.15339043736457825, 0.15316224098205566, 0.15569636225700378]
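One common kind of "moving part" that survives `trainable = False` is behaviour that is only active in training mode, such as dropout, which Bert-style models typically contain. This is a minimal sketch (not using Bert, and only an assumption about what is happening here) of how `training=True` can change the outputs, and hence the loss, while every weight stays fixed:

```python
import numpy as np
import tensorflow as tf

tf.random.set_seed(0)

# A model whose only layer is Dropout: no trainable weights at all.
inp = tf.keras.Input(shape=(4,))
out = tf.keras.layers.Dropout(0.5)(inp)
model = tf.keras.Model(inp, out)
model.trainable = False  # freezing changes nothing here

x = np.ones((1, 4), dtype="float32")

# Inference mode (used by predict/evaluate): dropout is the identity.
print(model(x, training=False).numpy())  # [[1. 1. 1. 1.]]

# Training mode (used by fit): units are randomly zeroed and the rest
# rescaled by 1/(1-rate), so outputs vary from call to call.
print(model(x, training=True).numpy())
```

If `fit` runs the forward pass with `training=True` while `evaluate` and `predict` use `training=False`, the logged loss can fluctuate between epochs even though the weights and the deterministic predictions are unchanged.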
The model.losses attribute shows nothing, and neither do the fancy TensorBoard graphs (if a TensorBoard callback is added).
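For contrast, here is a minimal sketch (a toy Dense model, not the Bert setup above) of what model.losses looks like when a regularizer really is contributing: it holds one tensor per registered regularization term, and evaluate's loss is the compiled loss plus the sum of those terms.

```python
import numpy as np
import tensorflow as tf

tf.random.set_seed(0)

# A tiny model with an explicit L2 regularizer, so model.losses is non-empty.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(4, input_shape=(3,),
                          kernel_regularizer=tf.keras.regularizers.l2(0.01)),
    tf.keras.layers.Dense(1),
])
model.compile(loss="mse", optimizer="sgd")

# One tensor per registered regularization term.
print(len(model.losses))  # 1

x = np.random.rand(8, 3).astype("float32")
y = np.zeros((8, 1), dtype="float32")

# evaluate() reports compiled loss + sum of model.losses.
total = model.evaluate(x, y, verbose=0)
mse = tf.reduce_mean(
    tf.keras.losses.MSE(y, model.predict(x, verbose=0))).numpy()
reg = sum(l.numpy() for l in model.losses)
print(abs(total - (mse + reg)) < 1e-4)  # True
```

Since model.losses is empty in the Bert script, whatever shifts the logged loss there does not seem to be a regularization term of this kind.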