How can I speed up execution?

import tensorflow as tf
import time

filename_queue = tf.train.string_input_producer(
    ["hdfs://default/twitter/twitter_rv.net"], num_epochs=1, shuffle=False)

def read_filename_queue(filename_queue):
    reader = tf.TextLineReader()
    _, line = reader.read(filename_queue)
    return line

line = read_filename_queue(filename_queue)

session_conf = tf.ConfigProto(intra_op_parallelism_threads=1500,
                              inter_op_parallelism_threads=1500)

with tf.Session(config=session_conf) as sess:
    sess.run(tf.initialize_local_variables())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    start = time.time()
    i = 0
    while True:
        i = i + 1
        if i % 100000 == 0:
            print(i)
            print(time.time() - start)
        try:
            sess.run([line])
        except tf.errors.OutOfRangeError:
            print('end of file')
            break
    print('total number of lines = ' + str(i))
    print(time.time() - start)

3 answers

User

Answer 1 · edited 2024-04-25 03:37:54

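You can split it into smaller files, and set intra_op_parallelism_threads and inter_op_parallelism_threads to 0, which lets TensorFlow pick an appropriate number of threads itself.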
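As a rough illustration of the "split it into smaller files" step, here is a minimal sketch in plain Python. The function name `split_into_shards`, the `part-%05d.txt` naming, and the `lines_per_shard` parameter are all made up for this example, and it works on a local file rather than on HDFS.

```python
import os


def split_into_shards(src_path, out_dir, lines_per_shard):
    """Split a text file into numbered shards of at most lines_per_shard lines.

    Hypothetical helper for illustration; returns the shard paths in order.
    """
    os.makedirs(out_dir, exist_ok=True)
    shard_paths = []
    out = None
    try:
        with open(src_path, "rb") as src:
            for line_no, line in enumerate(src):
                # Start a new shard every lines_per_shard lines.
                if line_no % lines_per_shard == 0:
                    if out is not None:
                        out.close()
                    path = os.path.join(
                        out_dir, "part-%05d.txt" % len(shard_paths))
                    shard_paths.append(path)
                    out = open(path, "wb")
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return shard_paths
```

The returned list has the same shape as the filename list the question passes to `tf.train.string_input_producer`, so several shards could presumably be queued instead of one large file.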

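For many systems, reading a single raw text file with multiple processes is not easy. TensorFlow reads one file with only one thread, so tuning the TensorFlow thread counts does not help. Spark can process the file with multiple threads because it splits the file into blocks: each thread reads every line in its own block and ignores the characters before the first \n, since they belong to the last line of the previous block. For batch data processing Spark is the better choice, while TensorFlow is better suited to machine learning / deep learning tasks.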
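The block-splitting idea described above can be sketched in plain Python. This is only an illustration of the technique, not Spark's actual implementation; the function names and the `num_blocks` parameter are assumptions made for the example. Each worker owns the lines that start inside its byte range, and skips the partial line at the front of the range because the previous block finishes it.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def count_lines_in_block(path, start, end):
    """Count the lines whose first byte lies in the byte range [start, end)."""
    count = 0
    with open(path, "rb") as f:
        if start > 0:
            # Step back one byte and read through the next newline. If the
            # block begins exactly at a line start, this consumes only the
            # previous "\n"; otherwise it skips the tail of a line that
            # belongs to the previous block.
            f.seek(start - 1)
            f.readline()
        while f.tell() < end:
            line = f.readline()
            if not line:
                break  # end of file
            count += 1
    return count


def count_lines_parallel(path, num_blocks=4):
    """Split the file into byte ranges and count lines in each concurrently."""
    size = os.path.getsize(path)
    block = max(1, -(-size // num_blocks))  # ceiling division
    ranges = [(s, min(s + block, size)) for s in range(0, size, block)]
    with ThreadPoolExecutor(max_workers=num_blocks) as pool:
        counts = pool.map(lambda r: count_lines_in_block(path, *r), ranges)
        return sum(counts)
```

A line that straddles a block boundary is read in full by the block where it starts, so no line is counted twice or dropped.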

User

Answer 2 · edited 2024-04-25 03:37:54

https://github.com/linkedin/TonY

With TonY, you can submit a TensorFlow job and specify the number of workers and whether they need CPUs or GPUs.

Scaling when running with TonY on multiple servers (the original mentions "v3" and linear scaling): [figure not preserved]

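Here is an example of how to use it, taken from the README:

In the tony directory there is also a tony.xml, which contains all of your TonY job configurations. For example: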

$ cat tony/tony.xml
<configuration>
  <property>
    <name>tony.worker.instances</name>
    <value>4</value>
  </property>
  <property>
    <name>tony.worker.memory</name>
    <value>4g</value>
  </property>
  <property>
    <name>tony.worker.gpus</name>
    <value>1</value>
  </property>
  <property>
    <name>tony.ps.memory</name>
    <value>3g</value>
  </property>
</configuration>

For the full list of configurations, see the wiki.

Model code: (listing not preserved)

Then you can launch your job:

$ java -cp "`hadoop classpath --glob`:tony/*:tony" \
            com.linkedin.tony.cli.ClusterSubmitter \
            -executes src/models/mnist_distributed.py \
            -task_params '--input_dir /path/to/hdfs/input --output_dir /path/to/hdfs/output --steps 2500 --batch_size 64' \
            -python_venv my-venv.zip \
            -python_binary_path Python/bin/python \
            -src_dir src \
            -shell_env LD_LIBRARY_PATH=/usr/java/latest/jre/lib/amd64/server

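The command-line arguments are as follows:
* executes describes the location of the entry point of your training code.
* task_params describes the command-line arguments that will be passed to your entry point.
* python_venv describes the name of the local zip that will invoke your python script.
* python_binary_path describes the relative path, inside your python virtual environment, that contains the python binary, or an absolute path to a python binary already installed on all worker nodes.
* src_dir specifies the name of the local root directory that contains all of your python model source code. This directory will be copied to all worker nodes.
* shell_env specifies key-value pairs for environment variables that will be set in your python worker/ps processes.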

User

Answer 3 · edited 2024-04-25 03:37:54

I am also a beginner working with tensorflow but since you were asking for answers drawing from credible and/or official sources, here is what I found and might help:

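Build and install from source
Use queues for reading data
Do preprocessing on the CPU
Use the NCHW image data format
Place shared parameters on the GPU
Use fused batch norm

Note: the points listed above are explained in more detail in the TensorFlow performance guide.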

Another thing you might want to look into is quantization:

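It covers how to use quantization to reduce model size, both in storage and at runtime. Quantization can improve performance, especially on mobile hardware.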
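To make the size reduction concrete, here is a minimal sketch of 8-bit linear quantization in plain Python. It is not TensorFlow's implementation; the `quantize` and `dequantize` helpers are hypothetical and only show the idea of storing each float as a one-byte code plus a shared scale and offset.

```python
def quantize(values):
    """Map floats to integer codes in 0..255 using one scale and offset."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255.0
    if scale == 0.0:
        scale = 1.0  # constant input: any scale works, avoid dividing by zero
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate floats from the integer codes."""
    return [c * scale + lo for c in codes]
```

Each value then needs one byte instead of four for a 32-bit float, at the cost of a rounding error of at most about half of `scale` per value.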
