I trained a TensorFlow model on a GPU cluster and saved it with
saver = tf.train.Saver()
saver.save(sess, config.model_file, global_step=global_step)
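Now I am trying to restore the model with
saver.restore(sess, tf.train.latest_checkpoint(save_path))
for evaluation on a different system. The problem is that the saver.restore line produces the following error: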
Traceback (most recent call last):
File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1664, in <module>
main()
File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1658, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1068, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/Users/jonpdeaton/Developer/BraTS18-Project/segmentation/evaluate.py", line 205, in <module>
main()
File "/Users/jonpdeaton/Developer/BraTS18-Project/segmentation/evaluate.py", line 162, in main
restore_and_evaluate(save_path, model_file, output_dir)
File "/Users/jonpdeaton/Developer/BraTS18-Project/segmentation/evaluate.py", line 127, in restore_and_evaluate
saver.restore(sess, tf.train.latest_checkpoint(save_path))
File "/Users/jonpdeaton/anaconda3/envs/BraTS/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1857, in latest_checkpoint
if file_io.get_matching_files(v2_path) or file_io.get_matching_files(
File "/Users/jonpdeaton/anaconda3/envs/BraTS/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py", line 337, in get_matching_files
for single_filename in filename
File "/Users/jonpdeaton/anaconda3/envs/BraTS/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 519, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: /afs/cs.stanford.edu/u/jdeaton/dfs/unet; No such file or directory
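It seems that some paths stored in the model or in the checkpoint file are no longer valid on the system where I am running the evaluation. After copying the model-X.meta, model-X.index, and checkpoint files, how can I restore the model on another computer (for evaluation)?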
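By default, the Saver object writes absolute model checkpoint paths into the checkpoint file. As a result, the path returned by tf.train.latest_checkpoint(save_path) is the absolute path on the old machine.

Temporary solution:
1. Pass the actual model file path directly to the restore method, rather than the result of tf.train.latest_checkpoint.
2. Manually edit the checkpoint file, which is a simple text file.

Long-term solution:
Create the Saver with save_relative_paths=True, so that the checkpoint file records paths relative to its own directory instead of absolute ones.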
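The first temporary workaround (passing the path to restore yourself) can be sketched as follows. This is only an illustration under assumptions: the helper name newest_checkpoint_prefix is hypothetical, and it assumes the copied files follow the model-<step>.index naming that saver.save produces when global_step is given. It never reads the checkpoint text file, so the stale absolute paths do not matter.

```python
import os
import re


def newest_checkpoint_prefix(save_dir, basename="model"):
    """Return the prefix of the highest-step checkpoint in save_dir, or None.

    Hypothetical helper: it scans for files named '<basename>-<step>.index'
    and ignores the paths recorded in the 'checkpoint' text file.
    """
    pattern = re.compile(r"^%s-(\d+)\.index$" % re.escape(basename))
    best_step, best_prefix = -1, None
    for name in os.listdir(save_dir):
        match = pattern.match(name)
        if match and int(match.group(1)) > best_step:
            best_step = int(match.group(1))
            # The checkpoint prefix is the file name without '.index'.
            best_prefix = os.path.join(save_dir, name[:-len(".index")])
    return best_prefix
```

The returned prefix would then be used in place of the latest_checkpoint result, for example saver.restore(sess, newest_checkpoint_prefix(save_path)). Note that restore also needs the matching model-X.data-* shard files to sit next to the .index file.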
Another option: open the checkpoint file with your favorite text editor and change the absolute paths in it to just the file names.
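If there are many entries, the same edit can be scripted instead of done by hand. The sketch below is an assumption-laden illustration, not part of TensorFlow: the function names are hypothetical, and it assumes the checkpoint file holds lines of the form key: "path" (such as model_checkpoint_path and all_model_checkpoint_paths) written on a POSIX machine. Lines without a quoted value are left unchanged.

```python
import os
import posixpath
import re

# Matches lines such as: model_checkpoint_path: "/old/machine/dir/model-1000"
_PATH_LINE = re.compile(r'^(\s*\w+:\s*")([^"]*)(".*)$')


def strip_checkpoint_dirs(text):
    """Replace every quoted path in 'checkpoint' file text with its base name."""
    fixed = []
    for line in text.splitlines():
        match = _PATH_LINE.match(line)
        if match:
            head, path, tail = match.groups()
            # Assumes the stored paths use '/' separators (POSIX training host).
            line = head + posixpath.basename(path) + tail
        fixed.append(line)
    return "\n".join(fixed) + "\n"


def fix_checkpoint_file(save_dir):
    """Rewrite <save_dir>/checkpoint in place, keeping a .bak copy first."""
    path = os.path.join(save_dir, "checkpoint")
    with open(path) as f:
        original = f.read()
    with open(path + ".bak", "w") as f:
        f.write(original)
    with open(path, "w") as f:
        f.write(strip_checkpoint_dirs(original))
```

After running fix_checkpoint_file(save_path), tf.train.latest_checkpoint(save_path) should resolve the bare file names relative to save_path, which is the behavior the manual edit relies on.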