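Updating the TensorFlow binary in a PyCharm virtual environment to use AVX2

2 answers

User

Answer 1 · edited 2024-05-23 22:40:08

Anaconda/conda as the package manager:

This assumes Anaconda/conda is already installed on your machine. If it is not, follow the installation instructions at https://docs.anaconda.com/anaconda/install/windows/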

conda create --name tensorflow_optimized python=3.7
conda activate tensorflow_optimized

# you need Intel's TensorFlow build, which is optimized to use SSE4.1 SSE4.2 AVX AVX2 FMA
conda install tensorflow-mkl -c anaconda

# run this to check whether the installed version is using MKL,
# which in turn uses all the optimizations that your system provides.
python -c "import tensorflow as tf; tf.test.is_gpu_available(cuda_only=False, min_cuda_compute_capability=None)"

# you should see something like this as the output.
2020-07-14 19:19:43.059486: I tensorflow/core/platform/cpu_feature_guard.cc:145] This TensorFlow binary is optimized with Intel(R) MKL-DNN to use the following CPU instructions in performance critical operations:  SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in non-MKL-DNN operations, rebuild TensorFlow with the appropriate compiler flags.

pip3 as the package manager:

py -m venv tensorflow_optimized
.\tensorflow_optimized\Scripts\activate

# once the env is activated, you need Intel's TensorFlow build,
# which is optimized to use SSE4.1 SSE4.2 AVX AVX2 FMA
pip install intel-tensorflow

# run this to check whether the installed version is using MKL,
# which in turn uses all the optimizations that your system provides.
py -c "import tensorflow as tf; tf.test.is_gpu_available(cuda_only=False, min_cuda_compute_capability=None)"

# you should see something like this as the output.
2020-07-14 19:19:43.059486: I tensorflow/core/platform/cpu_feature_guard.cc:145] This TensorFlow binary is optimized with Intel(R) MKL-DNN to use the following CPU instructions in performance critical operations:  SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in non-MKL-DNN operations, rebuild TensorFlow with the appropriate compiler flags.
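
If you would rather confirm AVX2 programmatically than read the log by eye, the message can be parsed. The following is a minimal sketch that assumes the log line has the same wording as the sample output above; the helper name parse_cpu_instructions is hypothetical, and other TensorFlow versions may word the message differently.

```python
# Marker text copied from the cpu_feature_guard log line shown above.
# Assumption: the installed TensorFlow version uses this exact wording.
MARKER = "CPU instructions in performance critical operations:"


def parse_cpu_instructions(log_text):
    """Return the instruction-set names listed after the marker, or [] if absent."""
    for line in log_text.splitlines():
        _, found, tail = line.partition(MARKER)
        if found:
            return tail.split()
    return []


# Sample copied from the output above; in practice, pass in the captured log text.
sample = (
    "2020-07-14 19:19:43.059486: I tensorflow/core/platform/cpu_feature_guard.cc:145] "
    "This TensorFlow binary is optimized with Intel(R) MKL-DNN to use the following "
    "CPU instructions in performance critical operations:  SSE4.1 SSE4.2 AVX AVX2 FMA"
)

instructions = parse_cpu_instructions(sample)
print("AVX2 listed:", "AVX2" in instructions)
```

TensorFlow writes this message to stderr, so the text would need to be captured from there before being passed to the helper.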

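Once you have this, you can set up PyCharm to use this environment.

Before that, run where python on Windows, or which python on Linux and Mac, while the env is activated; this should give you the path to the interpreter. In PyCharm, go to Preferences -> Project: your project name -> Project Interpreter -> click the settings symbol -> click Add

Select System Interpreter -> click the browse (...) button -> this opens a popup window that asks for the location of the Python interpreter

In the location path, paste the path you got from where python -> click OK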
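
As a cross-platform alternative to where python / which python, the interpreter can report its own path. This is a small sketch to run with the environment activated; the variable names are only illustrative.

```python
import sys

# sys.executable is the path of the interpreter running this code;
# this is the value to paste into PyCharm's interpreter location field.
interpreter_path = sys.executable

# sys.prefix is the root directory of the active environment.
environment_root = sys.prefix

print("Interpreter:", interpreter_path)
print("Environment:", environment_root)
```

If the printed path does not point inside tensorflow_optimized, the environment was probably not active when the script ran.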

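You should now see all the packages installed in that environment.

From then on, if you want to select that interpreter for your project, click python3/python2 (your interpreter name) in the bottom-right corner and choose the interpreter you need.

I recommend installing Anaconda as your default package manager, because it makes a developer's life easier when working with Python on a Windows machine, but you can also use pip.

User

Answer 2 · edited 2024-05-23 22:40:08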

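If your CPU utilization stays below 100% most of the time during training, you should not even bother getting a different TF binary.

Depending on the workload you are running, you may not see any benefit from using AVX2 (or AVX512).

AVX2 is a set of CPU vector instructions that operate on 256-bit registers. Compared with 128-bit streaming (SSE) instructions, you can gain at most 2x. Deep learning models are largely limited by memory bandwidth, so they see little benefit from switching to a larger register size. A simple way to check: look at how long CPU utilization stays at 100% during training. If it is below 100% most of the time, you are probably already limited by memory (or something else). If your training runs on a GPU, with the CPU used only for data preprocessing and occasional ops, the benefit is even less noticeable.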
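
The 2x ceiling and the effect of being memory-bound can be put into numbers. The sketch below counts float32 lanes per register and then applies Amdahl's law, which is my choice of model and not something the answer itself uses. The 30% vector-bound fraction is a made-up figure for illustration only.

```python
def lanes(register_bits, element_bits=32):
    """Number of elements of the given width that fit in one vector register."""
    return register_bits // element_bits


def overall_speedup(vector_fraction, vector_speedup):
    """Amdahl's law: only the vector-bound fraction of the runtime gets faster."""
    return 1.0 / ((1.0 - vector_fraction) + vector_fraction / vector_speedup)


sse_lanes = lanes(128)   # float32 values per 128-bit register
avx2_lanes = lanes(256)  # float32 values per 256-bit register
peak_gain = avx2_lanes / sse_lanes

# Hypothetical workload: only 30% of the runtime is bound by vector arithmetic;
# the rest waits on memory, data loading, or other work.
estimate = overall_speedup(vector_fraction=0.3, vector_speedup=peak_gain)

print(f"peak gain: {peak_gain:.1f}x, estimated overall gain: {estimate:.2f}x")
```

Under these assumptions the overall gain is well below 2x, which is the point of the utilization check described above.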

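Back to your question. The best way to update the TF binary so that it takes full advantage of the latest CPU architecture, CUDA version, Python version, and so on is to build TensorFlow from source. This may take a few hours of your time. It is the official and most robust way to solve your problem.

If you would be satisfied with simply using better CPU instructions, you can try installing different third-party binaries from wherever you can find them. Installing conda and pointing the PyCharm interpreter at the conda installation would be one of the options.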
