Error when predicting with the same kind of data used for training (expected 3 inputs, but received 75 input tensors)

Posted 2024-04-19 13:42:29


After training the model, I tried to run a prediction, but I got an error and I don't know how to fix it.

The model is built with ELECTRA.

Here is my model:

electra = TFElectraModel.from_pretrained("monologg/koelectra-base-v3-discriminator", from_pt=True)
input_ids = tf.keras.Input(shape=(MAX_LEN,), name='input_ids', dtype=tf.int32)
mask = tf.keras.Input(shape=(MAX_LEN,), name='attention_mask', dtype=tf.int32)
token = tf.keras.Input(shape=(MAX_LEN,), name='token_type_ids', dtype=tf.int32)
embeddings = electra(input_ids, attention_mask = mask, token_type_ids= token)[0]
X = tf.keras.layers.GlobalMaxPool1D()(embeddings)
X = tf.keras.layers.BatchNormalization()(X)
X = tf.keras.layers.Dense(128, activation='relu')(X)
X = tf.keras.layers.Dropout(0.1)(X)
y = tf.keras.layers.Dense(3, activation='softmax', name='outputs')(X)
model = tf.keras.Model(inputs=[input_ids, mask, token], outputs=y)
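# Note: per the summary, layers[2] is the token_type_ids InputLayer; the ELECTRA layer is layers[3], so this line freezes nothing.
model.layers[2].trainable=False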
model.summary()

Here is the summary:

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_ids (InputLayer)          [(None, 25)]         0                                            
__________________________________________________________________________________________________
attention_mask (InputLayer)     [(None, 25)]         0                                            
__________________________________________________________________________________________________
token_type_ids (InputLayer)     [(None, 25)]         0                                            
__________________________________________________________________________________________________
tf_electra_model_4 (TFElectraMo TFBaseModelOutput(la 112330752   input_ids[0][0]                  
                                                                 attention_mask[0][0]             
                                                                 token_type_ids[0][0]             
__________________________________________________________________________________________________
global_max_pooling1d_6 (GlobalM (None, 768)          0           tf_electra_model_4[3][0]         
__________________________________________________________________________________________________
batch_normalization_7 (BatchNor (None, 768)          3072        global_max_pooling1d_6[0][0]     
__________________________________________________________________________________________________
dense_18 (Dense)                (None, 128)          98432       batch_normalization_7[0][0]      
__________________________________________________________________________________________________
dropout_390 (Dropout)           (None, 128)          0           dense_18[0][0]                   
__________________________________________________________________________________________________
outputs (Dense)                 (None, 3)            387         dropout_390[0][0]                
==================================================================================================
Total params: 112,432,643
Trainable params: 112,431,107
Non-trainable params: 1,536
__________________________________________________________________________________________________

This is the code that creates the training dataset.

input_ids = []
attention_masks = []
token_type_ids = []
train_data_labels = []

for train_sent, train_label in tqdm(zip(train_data["content"], train_data["label"]), total=len(train_data)):
    try:
        input_id, attention_mask, token_type_id = Electra_tokenizer(train_sent, MAX_LEN)
        input_ids.append(input_id)
        attention_masks.append(attention_mask)
        token_type_ids.append(token_type_id)
        train_data_labels.append(train_label)

    except Exception as e:
        print(e)
        print(train_sent)
        pass

train_input_ids = np.array(input_ids, dtype=int)
train_attention_masks = np.array(attention_masks, dtype=int)
train_type_ids = np.array(token_type_ids, dtype=int)
intent_train_inputs = (train_input_ids, train_attention_masks, train_type_ids)
intent_train_data_labels = np.asarray(train_data_labels, dtype=np.int32)

This is the shape of the training dataset:

tf.Tensor([ 3 75 25], shape=(3,), dtype=int32)

With this training data the model trains fine, but running the following code to predict raises an error.

sample_text = 'this is sample text'
input_id, attention_mask, token_type_id = Electra_tokenizer(sample_text, MAX_LEN)
sample_text = (input_id, attention_mask, token_type_id)
model(sample_text) #or model.predict(sample_text)

Here is the error:

Layer model_15 expects 3 input(s), but it received 75 input tensors. Inputs received: [<tf.Tensor: shape=(), dtype=int32, numpy=2>, <tf.Tensor: ....

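The input looks like it has the same shape as what I used for training, so why do I get this error? Any help fixing it would be appreciated.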

I hope you have a great year. Happy New Year.


1 answer
User
#1 · Posted 2024-04-19 13:42:29

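This is a tensor dimension problem: each of the three inputs needs a leading batch dimension, even for a single sentence.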
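
The 75 in the error message is most likely 3 × 25: when the model is called with a tuple of three plain Python lists, the nested structure gets flattened down to its individual integers, so each token id arrives as its own scalar tensor. The sketch below only illustrates that counting. The `flatten` function is a simplified stand-in written for this post, not the actual Keras code, and the token values are made up (it assumes `Electra_tokenizer` returns plain lists of length `MAX_LEN`).

```python
def flatten(structure):
    """Collect the leaves of nested lists/tuples (simplified stand-in, not Keras code)."""
    if isinstance(structure, (list, tuple)):
        leaves = []
        for item in structure:
            leaves.extend(flatten(item))
        return leaves
    return [structure]


MAX_LEN = 25  # matches the (None, 25) input shapes in the model summary

# Hypothetical tokenizer output for ONE sentence: three plain lists of length MAX_LEN.
input_id = [2] + [0] * (MAX_LEN - 1)
attention_mask = [1] + [0] * (MAX_LEN - 1)
token_type_id = [0] * MAX_LEN

sample = (input_id, attention_mask, token_type_id)
num_leaves = len(flatten(sample))
print(num_leaves)  # 75, i.e. 3 inputs * 25 tokens
```

So the 75 here comes from the single sample itself and only happens to equal the number of training rows. The training tuple worked because each of its three elements was already a 2-D NumPy array of shape (samples, 25).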

test_input_ids = np.array(test_input_ids, dtype=np.int32)
test_attention_mask = np.array(test_attention_mask, dtype=np.int32)
test_token_type_id = np.array(test_token_type_id, dtype=np.int32)
ids = np.expand_dims(test_input_ids, axis=0)
atm = np.expand_dims(test_attention_mask, axis=0)
tok = np.expand_dims(test_token_type_id, axis=0)
model([ids, atm, tok])  # works fine
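
The same steps can be wrapped in a small helper so that every single-sentence prediction goes through one place. This is only a sketch: `to_batch` is a name made up for this post, and the dummy token values stand in for whatever `Electra_tokenizer` returns.

```python
import numpy as np


def to_batch(input_id, attention_mask, token_type_id):
    """Turn one tokenized sentence into a list of three (1, MAX_LEN) int32 arrays."""
    return [
        np.expand_dims(np.asarray(x, dtype=np.int32), axis=0)
        for x in (input_id, attention_mask, token_type_id)
    ]


MAX_LEN = 25  # matches the (None, 25) input shapes in the model summary

# Hypothetical tokenizer output for one sentence (values are placeholders).
dummy = ([2] + [0] * (MAX_LEN - 1), [1] + [0] * (MAX_LEN - 1), [0] * MAX_LEN)

batch = to_batch(*dummy)
print([b.shape for b in batch])  # [(1, 25), (1, 25), (1, 25)]
```

With the model from the question, `model(batch)` or `model.predict(batch)` should then receive exactly three inputs, and the output should have shape (1, 3) for the three softmax classes.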
