TensorFlow model save error: "Unable to synchronously create dataset (name already exists)"

Question

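I built an AI model and trained it successfully. However, saving the model fails. The error says the **h5** file name already exists, which is odd, because the error appears whether or not the file actually exists. Interestingly, similar code in my other projects does not have this problem.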

Error: Unable to synchronously create dataset (name already exists)
Traceback (most recent call last):
  File "C:\Users\Lenovo-Z\Documents\Text\Voice Line\main.py", line 333, in main
    model.save('VoiceLine_Model.h5')
  File "C:\Users\Lenovo-Z\.conda\envs\voiceline_myenv2\lib\site-packages\keras\src\utils\traceback_utils.py", line 123, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\Lenovo-Z\AppData\Roaming\Python\Python310\site-packages\h5py\_hl\group.py", line 183, in create_dataset
    dsid = dataset.make_new_dset(group, shape, dtype, data, name, **kwds)
  File "C:\Users\Lenovo-Z\AppData\Roaming\Python\Python310\site-packages\h5py\_hl\dataset.py", line 163, in make_new_dset
    dset_id = h5d.create(parent.id, name, tid, sid, dcpl=dcpl, dapl=dapl)
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py\h5d.pyx", line 137, in h5py.h5d.create
ValueError: Unable to synchronously create dataset (name already exists)

Code:

import pickle

import numpy as np


def save_artifacts(tokenizer, encoder, model, embedding_matrix):
    # Save tokenizer state with pickle (the output file is a pickle, not JSON)
    tokenizer_data = {
        "word_index": tokenizer.word_index,
        "index_word": tokenizer.index_word,
        "word_counts": tokenizer.word_counts,
        "document_count": tokenizer.document_count
    }
    with open("tokenizer.pkl", "wb") as tokenizer_file:
        pickle.dump(tokenizer_data, tokenizer_file)

    # Save label encoder using pickle
    with open("label_encoder.pkl", "wb") as label_file:
        pickle.dump(encoder, label_file)

    # Save model architecture as JSON
    model_json = model.to_json()
    with open("model_architecture.json", "w") as json_file:
        json_file.write(model_json)

    # Save words using pickle
    with open("words.pkl", "wb") as words_file:
        pickle.dump(tokenizer.word_index, words_file)

    # Save classes using pickle
    with open("classes.pkl", "wb") as classes_file:
        pickle.dump(encoder.classes_, classes_file)

    # Save embedding matrix
    np.save("embedding_matrix.npy", embedding_matrix)

...

model = build_combined_model(embedding_dim=EMBEDDING_DIM, num_classes=len(encoder.classes_), vocab_size=len(tokenizer.word_index) + 1)

compile_model(model)

callbacks = get_callbacks()

model.fit(train_tokens, train_labels_one_hot, epochs=EPOCHS, batch_size=BATCH_SIZE, validation_data=(test_tokens, test_labels_one_hot), callbacks=callbacks)

print("Model training completed.")
model.save('VoiceLine_Model.h5')
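
A diagnostic sketch, not part of the original code. The traceback ends in h5py's `create_dataset`, so the "name" in the message refers to a dataset inside the HDF5 file, not to the file on disk. That would explain why the error appears whether or not `VoiceLine_Model.h5` exists. One possible cause (an assumption, not confirmed by anything shown here) is that two weights in the model end up with the same name. The helper `find_duplicate_names` below is hypothetical and only reports repeated names in a list. How to obtain the weight names depends on the Keras version, so that part is left as a commented suggestion.

```python
from collections import Counter


def find_duplicate_names(names):
    """Return the names that occur more than once, sorted alphabetically."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


# Stand-alone example with made-up weight names:
example_names = ["dense/kernel", "dense/bias", "dense/kernel"]
print(find_duplicate_names(example_names))  # ['dense/kernel']

# Possible usage with the model from the question (assumption: each weight
# exposes `.path` in Keras 3 or `.name` in older versions):
# weight_names = [getattr(w, "path", w.name) for w in model.weights]
# print(find_duplicate_names(weight_names))
```

If this reports duplicates, giving the affected layers distinct `name=` arguments inside `build_combined_model` may be worth trying. If it reports nothing, the cause lies elsewhere.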
