Additionally, optional arguments to the Saver() constructor let you
control the proliferation of checkpoint files on disk:
max_to_keep: indicates the maximum number of recent checkpoint files to keep. As new files are created, older files are deleted. If None or 0, no checkpoint files are deleted from the filesystem, but only the last one is kept in the checkpoint file. Defaults to 5 (that is, the 5 most recent checkpoint files are kept).
keep_checkpoint_every_n_hours: in addition to keeping the most recent max_to_keep checkpoint files, you might want to keep one checkpoint file for every N hours of training. This can be useful if you want to later analyze how a model progressed during a long training session. For example, passing keep_checkpoint_every_n_hours=2 ensures that one checkpoint file is kept for every 2 hours of training. The default value of 10,000 hours effectively disables the feature.
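For illustration, a minimal TF1-style sketch of passing both arguments to tf.train.Saver could look like the following; the variable names, step counts, and save path are made up for the example:

import tensorflow as tf

# Toy variables standing in for a real model's parameters.
weights = tf.Variable(tf.random_normal([10, 10]), name="weights")
bias = tf.Variable(tf.zeros([10]), name="bias")

# Keep at most the 5 most recent checkpoint files, and additionally
# keep one checkpoint for every 2 hours of training.
saver = tf.train.Saver(max_to_keep=5, keep_checkpoint_every_n_hours=2)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1, 1001):
        # ... run one training step here ...
        if step % 100 == 0:
            # Files older than the 5 most recent are deleted automatically.
            saver.save(sess, "/tmp/model.ckpt", global_step=step)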
For the saving interval and the number of checkpoints to keep, have a look at: https://www.tensorflow.org/api_docs/python/tf/train/Saver
From the link above:
-> max_to_keep
-> keep_checkpoint_every_n_hours
I believe you can reference these in the training config, if you are using one. Check out the trainer.py file located in the same legacy directory; around line 375 it references keep_checkpoint_every_n_hours. What it does not reference is max_to_keep, which you may need to add to that script yourself (a sketch of that kind of change is below).
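A hedged sketch of that change, assuming the Saver construction in the legacy trainer.py passes only keep_checkpoint_every_n_hours; the dummy variable, the config value, and the number 20 are illustrative and not taken from the actual file:

import tensorflow as tf

# Dummy variable so this snippet builds a valid graph on its own; in
# trainer.py the model's variables already exist at this point.
global_step = tf.Variable(0, trainable=False, name="global_step")

# Illustrative value; in trainer.py this would come from the train config.
keep_checkpoint_every_n_hours = 2

# The existing call passes only keep_checkpoint_every_n_hours; adding
# max_to_keep (20 here is just an example) controls how many recent
# checkpoint files are retained on disk.
saver = tf.train.Saver(
    keep_checkpoint_every_n_hours=keep_checkpoint_every_n_hours,
    max_to_keep=20)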
That said, although it is hard to say for certain without all the information, I can't help thinking you are going about this the wrong way. Collecting every checkpoint and reviewing them afterwards does not seem like the right way to deal with overfitting. Run TensorBoard and check your training results there. Additionally, running some evaluations of the model on evaluation data will give you plenty of insight into what the model is doing.
Good luck with your training!