Distributed TensorFlow hangs at sess.run()

 import tensorflow as tf
 cluster = tf.train.ClusterSpec({"local": ["localhost:2222", "localhost:2223"]})
 server = tf.train.Server(cluster, job_name="local", task_index=0)
 a = tf.constant(8)
 b = tf.constant(9)
 sess = tf.Session('grpc://localhost:2222')

1 answer

User

#1 · Posted 2024-05-16 20:53:58

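By default, distributed TensorFlow blocks until every server named in the tf.train.ClusterSpec has started. This happens during the first interaction with the server, which is usually the first sess.run() call. So if you have not also started a server listening on localhost:2223, TensorFlow will block until you do.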
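To find out which tasks are still missing before the first sess.run() call blocks, something like the following sketch may help. It uses only the Python standard library. The helper name unreachable_tasks and its timeout parameter are made up for illustration and are not part of TensorFlow. A successful TCP connection only shows that something is listening on the port, not that it is a TensorFlow server.

```python
import socket


def unreachable_tasks(cluster_def, timeout=0.5):
    """Return (job, task_index, address) for each address that refuses a TCP connection.

    cluster_def is the same dict that would be passed to tf.train.ClusterSpec,
    e.g. {"local": ["localhost:2222", "localhost:2223"]}.
    This is a hypothetical helper, not a TensorFlow API.
    """
    missing = []
    for job, addresses in cluster_def.items():
        for task_index, address in enumerate(addresses):
            host, port = address.rsplit(":", 1)
            try:
                # Open and immediately close a connection to test the port.
                with socket.create_connection((host, int(port)), timeout=timeout):
                    pass
            except OSError:
                # Connection refused, timed out, or host not resolvable.
                missing.append((job, task_index, address))
    return missing


if __name__ == "__main__":
    cluster_def = {"local": ["localhost:2222", "localhost:2223"]}
    for job, task_index, address in unreachable_tasks(cluster_def):
        print("Nothing is listening for /job:%s/task:%d at %s"
              % (job, task_index, address))
```

If the check reports localhost:2223, that is the task the session would be waiting for.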

Depending on what you want to do later, there are a few solutions:

Start a server on localhost:2223. In another process, run the following script:

 import tensorflow as tf
 cluster = tf.train.ClusterSpec({"local": ["localhost:2222", "localhost:2223"]})
 server = tf.train.Server(cluster, job_name="local", task_index=1)
 server.join()  # Wait forever for incoming connections.

Remove task 1 from the original tf.train.ClusterSpec:

 import tensorflow as tf
 cluster = tf.train.ClusterSpec({"local": ["localhost:2222"]})
 server = tf.train.Server(cluster, job_name="local", task_index=0)
 # ...

Specify a "device filter" when you create the tf.Session, so that the session only uses task 0:

 # ...
 sess = tf.Session("grpc://localhost:2222",
                   config=tf.ConfigProto(device_filters=["/job:local/task:0"]))
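If the session should later see more than one task, the filter strings can be generated rather than typed by hand. The sketch below is plain Python; device_filters_for is a hypothetical helper name, not a TensorFlow function, and it assumes the "/job:<name>/task:<index>" format used in the snippet above.

```python
def device_filters_for(job_name, task_indices):
    """Build device filter strings such as "/job:local/task:0".

    Hypothetical helper for illustration; the result is intended for the
    device_filters argument of tf.ConfigProto.
    """
    return ["/job:%s/task:%d" % (job_name, index) for index in task_indices]


# Restrict the session to task 0 of the "local" job.
filters = device_filters_for("local", [0])
print(filters)

# Intended use with the session from the answer (requires TensorFlow 1.x):
# sess = tf.Session("grpc://localhost:2222",
#                   config=tf.ConfigProto(device_filters=filters))
```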
