I am trying to run a training job for an object detection model on Google Cloud. It fails after logging the following in each ps replica:
Check failed: DeviceNameUtils::ParseFullName(new_base, &parsed_name)
{
insertId: "1am4lt7g2ytgyip"
jsonPayload: {
created: 1532870862.316736
levelname: "CRITICAL"
lineno: 27
message: "Check failed: DeviceNameUtils::ParseFullName(new_base, &parsed_name) "
pathname: "tensorflow/core/common_runtime/renamed_device.cc"
}
labels: {
compute.googleapis.com/resource_id: "8188383009228980271"
compute.googleapis.com/resource_name: "cmle-training-ps-1d73aafb3a-0-7bjnw"
compute.googleapis.com/zone: "us-central1-a"
ml.googleapis.com/job_id: "object_detection_07_29_2018_14_17_36"
ml.googleapis.com/job_id/log_area: "root"
ml.googleapis.com/task_name: "ps-replica-0"
ml.googleapis.com/trial_id: ""
}
logName: "projects/object-detection-210310/logs/ps-replica-0"
receiveTimestamp: "2018-07-29T13:27:48.515404065Z"
resource: {
labels: {
job_id: "object_detection_07_29_2018_14_17_36"
project_id: "object-detection-210310"
task_name: "ps-replica-0"
}
type: "ml_job"
}
severity: "CRITICAL"
timestamp: "2018-07-29T13:27:42.316735982Z"
}
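This is followed by:
^{pr2}$
This setup ran successfully once before, but the problem persists now. The only difference is the bucket name, which I changed in the config file and in the training job submission command.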
Please help.
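I think I found the problem. I had added TensorFlow to setup.py while trying to work around a different problem I ran into earlier. After I removed it, this error no longer appeared. Thanks a lot, everyone.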
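For anyone hitting the same error, here is a minimal sketch of the fix described above: keeping TensorFlow out of the dependency list that setup.py declares. The package names and version pins are illustrative assumptions, not the contents of my actual setup.py, and `strip_runtime_packages` is a hypothetical helper. My guess at the cause, which I have not confirmed, is that the training runtime already ships its own TensorFlow, so listing it again can pull in a second, mismatched build.

```python
import re

# Hypothetical dependency list; names and pins are illustrative assumptions.
DECLARED = ["tensorflow>=1.8", "Pillow>=1.0", "Matplotlib>=2.1", "Cython>=0.28.1"]

# Packages assumed to be supplied by the training runtime itself.
RUNTIME_PROVIDED = {"tensorflow", "tensorflow-gpu"}


def requirement_name(req):
    """Return the lowercase package name from a requirement string."""
    return re.split(r"[<>=!~\[; ]", req.strip(), maxsplit=1)[0].lower()


def strip_runtime_packages(requirements, provided=RUNTIME_PROVIDED):
    """Drop requirements whose package the runtime is assumed to provide."""
    return [r for r in requirements if requirement_name(r) not in provided]


REQUIRED_PACKAGES = strip_runtime_packages(DECLARED)

# In setup.py this list would then be passed to setuptools, for example:
#   setup(name=..., install_requires=REQUIRED_PACKAGES, ...)
```

Simply deleting the TensorFlow entry by hand does the same thing. The helper only makes it harder to add it back by accident.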