Gradients of the layers of a neural network loaded in Chainer

2 answers

User

Answer 1 · edited 2024-05-16 22:06:23

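To get the .grad of the input image, you have to wrap the input in chainer.Variable.
However, VGGLayers.extract() does not accept a Variable as input, so in this case you should call .forward() or its wrapper, __call__().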

import chainer
from chainer import Variable
from chainer import functions as F
from cv2 import imread
from chainer.links.model.vision import vgg

net = vgg.VGG16Layers(pretrained_model='auto')

# convert the raw image (np.ndarray, dtype=uint8) to a batch of Variable (dtype=float32)
img = imread("path/to/image")
img = img[:, :, ::-1]  # cv2 loads BGR, while vgg.prepare expects RGB channel order
img = Variable(vgg.prepare(img))
img = img.reshape((1,) + img.shape)  # (channel, height, width) -> (batch, channel, height, width)

# just call the network; __call__() wraps VGG16Layers.forward
prob = net(img)['prob']
intermediate = F.square(prob)
loss = F.sum(intermediate)

# calculate the gradient of the loss with respect to the input image
img_grad, = chainer.grad([loss], [img])  # chainer.grad returns a list of Variables
print(img_grad.array)  # some ndarray

User

Answer 2 · edited 2024-05-16 22:06:23

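Point 1.
Do not call VGGLayers.predict(); it is not meant for backprop computation.
Use VGGLayers.extract() instead.

Point 2.
Do not apply np.square() and np.sum() directly to a chainer.Variable.
Use F.square() and F.sum() on a chainer.Variable instead.

Point 3.
Use loss.backward() to get .grad for the learnable parameters. (pattern 1)
Use loss.backward(retain_grad=True) to get .grad for all variables. (pattern 2)
Use chainer.grad() to get the gradient with respect to specific variables; it is returned rather than stored in .grad. (pattern 3)

Code: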

import chainer
from chainer import functions as F, links as L
from cv2 import imread

net = L.VGG16Layers(pretrained_model='auto')
img = imread("/path/to/img")[:, :, ::-1]  # cv2 loads BGR; extract() expects RGB channel order
prob = net.extract([img], layers=['prob'])['prob']  # NOT predict, which overrides chainer.config['enable_backprop'] as False
intermediate = F.square(prob)
loss = F.sum(intermediate)

# pattern 1: only the learnable parameters keep their .grad
loss.backward()
print(net.fc8.W.grad)  # some ndarray
print(intermediate.grad)  # None
###########################################
# reset all gradients before trying the next pattern
net.cleargrads()
intermediate.grad = None
prob.grad = None
###########################################

# pattern 2: intermediate variables keep their .grad as well
loss.backward(retain_grad=True)
print(net.fc8.W.grad)  # some ndarray
print(intermediate.grad)  # some ndarray

###########################################
net.cleargrads()
intermediate.grad = None
prob.grad = None
###########################################

# pattern 3: gradients are returned and no .grad attribute is touched
print(chainer.grad([loss], [net.fc8.W]))  # a list containing one Variable
print(intermediate.grad)  # None
