tf.nn.depthwise_conv2d太慢了。正常吗？ - 问答 - Python中文网

tf.nn.depthwise_conv2d太慢了。正常吗？

2024-05-16 18:11:42 发布

您现在位置：Python中文网/ 问答频道 /正文

男 | 程序猿一只，喜欢编程写python代码。

我正在尝试arxiv最近的一项工作，名为“Factorized CNN”

主要讨论了空间分离卷积（深度卷积）和信道线性投影（1x1conv）相结合的卷积运算速度。在

this is the figure for their conv layer architecture

我发现我可以用tf.nn.depthwise_conv2d和1x1卷积，或可分离变压器一

以下是我的实现：

#conv filter for depthwise convolution depthwise_filter = tf.get_variable("depth_conv_w", [3,3,64,1], initializer=tf.random_normal_initializer(stddev=np.sqrt(2.0/9/32))) #conv filter for linear channel projection pointwise_filter = tf.get_variable("point_conv_w", [1,1,64,64], initializer=tf.random_normal_initializer(stddev=np.sqrt(2.0/1/64))) conv_b = tf.get_variable("conv_b", [64], initializer=tf.constant_initializer(0)) #depthwise convolution, with multiplier 1 conv_tensor = tf.nn.relu(tf.nn.depthwise_conv2d(tensor, depthwise_filter, [1,1,1,1], padding='SAME')) #linear channel projection with 1x1 convolution conv_tensor = tf.nn.bias_add(tf.nn.conv2d(conv_tensor, pointwise_filter, [1,1,1,1], padding='VALID'), conv_b) #residual tensor = tf.add(tensor, conv_tensor)

这应该比原来的3x3x64->；64通道卷积快9倍左右。在

但是，我无法体验到任何绩效改进。在

我必须假设我做得不对，或者tensorflow的实现有问题。在

因为很少有使用depthwise_conv2d的例子，我把这个问题留在这里。在

这种慢速正常吗？还是有什么错误？在

Tags： for get tf random nn filter variable 卷积

2条回答

网友

1楼 · 编辑于 2024-05-16 18:11:42

Depthwise convolutions provide significant performance benefits owing to the reduction in both parameters and mult-adds. However, training depthwise convolution layers with GPUs is slow in current deep learning frameworks because their implementations cannot fully utilize the GPU capacity.

https://arxiv.org/pdf/1803.09926.pdf

网友

2楼 · 编辑于 2024-05-16 18:11:42

目前depthwise conv2d的实现没有充分利用GPU的并行能力，您需要等待将来更快的实现，例如，在caffe中，该内核存在更快的第三方impl https://github.com/yonghenglh6/DepthwiseConvolution

相关问题更多 >

编程相关推荐

热门问题

热门文章