ValueError: Shape mismatch: The shape of labels (received (1,)) should equal the shape of logits except for the last dimension (received (10, 30))

Posted 2024-06-08 05:22:51


I'm new to TensorFlow and would really appreciate your help. I'm trying to use a transformer model as an embedding layer and feed the data into a custom model:

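# MODEL_NAME, config, MAX_LEN, train_ds, train_steps and EPOCHS are defined elsewhere (not shown in this post)
import tensorflow as tf
from transformers import TFAutoModel
from tensorflow.keras import layers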
def build_model():
    transformer_model = TFAutoModel.from_pretrained(MODEL_NAME, config=config)
    
    input_ids_in = layers.Input(shape=(MAX_LEN,), name='input_ids', dtype='int32')
    input_masks_in = layers.Input(shape=(MAX_LEN,), name='attention_mask', dtype='int32')

    embedding_layer = transformer_model(input_ids_in, attention_mask=input_masks_in)[0]

    X = layers.Bidirectional(tf.keras.layers.LSTM(50, return_sequences=True, dropout=0.1, recurrent_dropout=0.1))(embedding_layer)
    X = layers.GlobalMaxPool1D()(X)
    X = layers.Dense(64, activation='relu')(X)
    X = layers.Dropout(0.2)(X)
    X = layers.Dense(30, activation='softmax')(X)

    model = tf.keras.Model(inputs=[input_ids_in, input_masks_in], outputs = X)

    for layer in model.layers[:3]:
        layer.trainable = False

    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model

    
model = build_model()
model.summary()
r = model.fit(
            train_ds,
            steps_per_epoch=train_steps,
            epochs=EPOCHS,
            verbose=3)

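I have 30 classes and the labels are not one-hot encoded, so I use sparse categorical cross-entropy as my loss function, but I keep getting the following error: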

ValueError: Shape mismatch: The shape of labels (received (1,)) should equal the shape of logits except for the last dimension (received (10, 30)).

How can I fix this? Why is the (10, 30) shape required? I know the 30 comes from the last Dense layer having 30 units, but why 10? Is it because MAX_LEN is 10?

My model summary:

Model: "model_16"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_ids (InputLayer)          [(None, 10)]         0                                            
__________________________________________________________________________________________________
attention_mask (InputLayer)     [(None, 10)]         0                                            
__________________________________________________________________________________________________
tf_bert_model_21 (TFBertModel)  TFBaseModelOutputWit 162841344   input_ids[0][0]                  
                                                                 attention_mask[0][0]             
__________________________________________________________________________________________________
bidirectional_17 (Bidirectional (None, 10, 100)      327600      tf_bert_model_21[0][0]           
__________________________________________________________________________________________________
global_max_pooling1d_15 (Global (None, 100)          0           bidirectional_17[0][0]           
__________________________________________________________________________________________________
dense_32 (Dense)                (None, 64)           6464        global_max_pooling1d_15[0][0]    
__________________________________________________________________________________________________
dropout_867 (Dropout)           (None, 64)           0           dense_32[0][0]                   
__________________________________________________________________________________________________
dense_33 (Dense)                (None, 30)           1950        dropout_867[0][0]                
==================================================================================================
Total params: 163,177,358
Trainable params: 336,014
Non-trainable params: 162,841,344


1 Answer

User

#1 · Posted 2024-06-08 05:22:51

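10 is the number of sequences in one batch. I suspect it is the number of sequences in your dataset.

Your model acts as a sequence classifier, so you should have one label for every sequence. With 10 sequences in the batch, the labels should have shape (10,), not (1,).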
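To illustrate, here is a minimal sketch of a dataset that pairs each sequence with exactly one integer label. The question does not show how train_ds was built, so everything here is an assumption: the random arrays are hypothetical stand-ins for real tokenizer output, and NUM_SEQUENCES and the batch size of 5 are picked only to make the shapes easy to read. The dictionary keys match the names of the two Input layers in the question.

```python
import numpy as np
import tensorflow as tf

NUM_CLASSES = 30    # from the question: 30 classes
MAX_LEN = 10        # from the model summary: inputs are (None, 10)
NUM_SEQUENCES = 10  # assumption: 10 sequences, matching the (10, 30) logits

# Hypothetical stand-in data; a real pipeline would use tokenizer output.
rng = np.random.default_rng(0)
input_ids = rng.integers(0, 1000, size=(NUM_SEQUENCES, MAX_LEN)).astype("int32")
attention_mask = np.ones((NUM_SEQUENCES, MAX_LEN), dtype="int32")
# One integer class id per sequence: shape (NUM_SEQUENCES,), not (1,).
labels = rng.integers(0, NUM_CLASSES, size=(NUM_SEQUENCES,)).astype("int32")

features = {"input_ids": input_ids, "attention_mask": attention_mask}

# from_tensor_slices slices features and labels along the first axis together,
# so every dataset element is (one sequence, one label).
train_ds = tf.data.Dataset.from_tensor_slices((features, labels)).batch(5)

# Inspecting the spec is a quick way to see what shapes fit() will receive.
print(train_ds.element_spec)

feature_batch, label_batch = next(iter(train_ds))
print(feature_batch["input_ids"].shape)  # (5, 10)
print(label_batch.shape)                 # (5,)

# The sparse loss accepts labels of shape (batch,) against
# predictions of shape (batch, classes).
probs = tf.nn.softmax(tf.random.normal((5, NUM_CLASSES)), axis=-1)
loss = tf.keras.losses.sparse_categorical_crossentropy(label_batch, probs)
print(loss.shape)  # (5,)
```

If printing the element_spec of your own train_ds shows a label shape whose first dimension differs from the first dimension of the inputs, the labels and features are probably being batched or sliced separately somewhere in the pipeline.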
