Getting the output of an intermediate layer (functional API) and using it in the subclassing API

The Keras docs say that if we want to get the output of an intermediate layer of a model (Sequential or Functional), all we have to do is the following:

model = ...  # create the original model

layer_name = 'my_layer'
intermediate_layer_model = keras.Model(inputs=model.input,
                                       outputs=model.get_layer(layer_name).output)
intermediate_output = intermediate_layer_model(data)
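
For concreteness, here is a minimal self-contained sketch of that doc pattern on a toy Sequential model (the model and layer names below are made up for the demo):

import numpy as np
from tensorflow import keras

# toy stand-in for "the original model"; names are made up for the demo
model = keras.Sequential([
    keras.layers.Input(shape=(8,)),
    keras.layers.Dense(16, activation='relu', name='my_layer'),
    keras.layers.Dense(4),
])

intermediate_layer_model = keras.Model(inputs=model.input,
                                       outputs=model.get_layer('my_layer').output)

data = np.random.rand(2, 8).astype('float32')
print(intermediate_layer_model(data).shape)  # (2, 16)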

Here we end up with two models: intermediate_layer_model is a sub-model of its parent model, and the two are independent. Likewise, if we take the output feature maps of an intermediate layer of the parent (or base) model, run some operation on them to get new output feature maps, we can also merge those feature maps back into the parent model:


import tensorflow as tf
from tensorflow.keras.layers import Add

size = 224  # e.g. DenseNet121's default input size
input = tf.keras.Input(shape=(size, size, 3))
model = tf.keras.applications.DenseNet121(input_tensor=input)

layer_name = "conv2_block1_0_relu"  # for example (a valid DenseNet121 layer name)
# SomeOperationLayer is a placeholder; see the sketch right after this snippet
output_feat_maps = SomeOperationLayer()(model.get_layer(layer_name).output)

# assume they're able to add up
base = Add()([model.output, output_feat_maps])

# bind all
imputed_model = tf.keras.Model(inputs=[model.input], outputs=base)
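
SomeOperationLayer above is just a placeholder. A minimal sketch of one possible such layer, assuming all it needs to do is preserve the input shape so later merging works, is a 1x1 convolution (an illustrative assumption, not part of the original question):

import tensorflow as tf

class SomeOperationLayer(tf.keras.layers.Layer):
    """Hypothetical stand-in: a shape-preserving 1x1 convolution."""
    def build(self, input_shape):
        # match the incoming channel count so output shape == input shape
        self.conv = tf.keras.layers.Conv2D(input_shape[-1], kernel_size=1, padding='same')

    def call(self, inputs):
        return self.conv(inputs)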

This gives us a modified model. With the functional API this is straightforward. All of the Keras ImageNet models are written in the functional API (mostly), and we can use them inside the model subclassing API. My concern here is: what if we need the intermediate output feature maps inside the call function of such a subclassed model?

class Subclass(tf.keras.Model):
    def __init__(self, dim):
        super(Subclass, self).__init__()
        self.dim = dim
        self.base = DenseNet121(input_shape=self.dim)

        # building new model with the desired output layer of base model
        self.mid_layer_model = tf.keras.Model(self.base.inputs,
                                              self.base.get_layer(layer_name).output)

    def call(self, inputs):
        # forward with base model
        x = self.base(inputs)

        # forward with mid_layer_model
        mid_feat = self.mid_layer_model(inputs)

        # do some op with it
        mid_x = SomeOperationLayer()(mid_feat)

        # assume they're able to add up
        out = tf.keras.layers.add([x, mid_x])

        return out

The issue is that, technically, we are now running two models jointly. But rather than building a model like that, all we want is the intermediate output feature maps of the base model (for some input) from its forward pass, so we can use them elsewhere and get some output. Like this:

mid_x = SomeOperationLayer()(self.base.get_layer(layer_name).output)

But that raises ValueError: Graph disconnected. So, for now, we have to build a new model from the base model around the desired intermediate layer. In the __init__ method we define or create the new self.mid_layer_model, which gives the output feature maps we need: mid_feat = self.mid_layer_model(inputs). Next, we take mid_feat, run some operation on it to get some output, and finally add everything with tf.keras.layers.add([x, mid_x]). So by creating a new model with the desired intermediate output we get what we want, but at the same time we repeat the same computation twice: once in the base model and once in its sub-model. Maybe I'm missing something obvious, so please chime in. That's it! Or maybe there is some strategy we can adopt. I've asked on the forum but got no response yet.
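
For reference, the disconnect is easy to reproduce outside the subclass too: a tensor like base.get_layer(name).output belongs to the base model's own graph, so it cannot be wired to a fresh, unrelated Input. A minimal sketch of the failure:

import tensorflow as tf
from tensorflow.keras.applications import DenseNet121

base = DenseNet121(weights=None, input_shape=(32, 32, 3))
new_input = tf.keras.Input(shape=(32, 32, 3))       # fresh input, unrelated to base
mid = base.get_layer("conv2_block1_0_relu").output  # lives in base's graph

try:
    # new_input never feeds mid, so Keras cannot trace a path between them
    tf.keras.Model(inputs=new_input, outputs=mid)
except ValueError as e:
    print(e)  # Graph disconnected: cannot obtain value for tensor ...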


Update

Below is a working example. Suppose we have a custom layer like this:

import tensorflow as tf
from tensorflow.keras.applications import DenseNet121
from tensorflow.keras.layers import Add
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten

class ConvBlock(tf.keras.layers.Layer):
    def __init__(self, kernel_num=32, kernel_size=(3,3), strides=(1,1), padding='same'):
        super(ConvBlock, self).__init__()
        # conv layer
        self.conv = tf.keras.layers.Conv2D(kernel_num, 
                        kernel_size=kernel_size, 
                        strides=strides, padding=padding)
        # batch norm layer
        self.bn = tf.keras.layers.BatchNormalization()

    def call(self, input_tensor, training=False):
        x = self.conv(input_tensor)
        x = self.bn(x, training=training)
        return tf.nn.relu(x)
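
As a quick sanity check (reusing the imports above), the layer preserves the spatial size thanks to padding='same' and stride 1; the input shape here is just an illustrative assumption:

x = tf.random.normal((1, 8, 8, 64))
block = ConvBlock(kernel_num=32)
print(block(x, training=False).shape)  # (1, 8, 8, 32)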

We want to plug this layer into an ImageNet model and build a model like this:

input = tf.keras.Input(shape=(32, 32, 3))
base = DenseNet121(weights=None, input_tensor=input)

# get output feature maps at a certain layer, i.e. conv2_block1_0_relu
cb = ConvBlock()(base.get_layer("conv2_block1_0_relu").output)
flat = Flatten()(cb)
dense = Dense(1000)(flat)

# adding up
adding = Add()([base.output, dense])
model = tf.keras.Model(inputs=[base.input], outputs=adding)

from tensorflow.keras.utils import plot_model
plot_model(model,
           show_shapes=True, show_dtype=True,
           show_layer_names=True, expand_nested=False)
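
A quick forward pass on random data confirms the wiring; both branches end in 1000-way vectors, so the Add is valid:

x = tf.random.normal((2, 32, 32, 3))
print(model(x).shape)  # (2, 1000)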

Here the computation from the input up to layer conv2_block1_0_relu happens only once. Now, if we want to convert this functional API model to the subclassing API, we first have to build a model from the base model's input up to the layer conv2_block1_0_relu, like so:

class ModelWithMidLayer(tf.keras.Model):
    def __init__(self, dim=(32, 32, 3)):
        super().__init__()
        self.dim = dim
        self.base = DenseNet121(input_shape=self.dim, weights=None)

        # building sub-model from self.base which gives
        # desired output feature maps: i.e. conv2_block1_0_relu
        self.mid_layer = tf.keras.Model(self.base.inputs,
                                        self.base.get_layer("conv2_block1_0_relu").output)

        self.flat = Flatten()
        self.dense = Dense(1000)
        self.add = Add()
        self.cb = ConvBlock()

    def call(self, x):
        # forward with base model
        bx = self.base(x)

        # forward with mid layer
        mx = self.mid_layer(x)

        # apply the custom block, then make same shape (mirrors the functional version)
        mx = self.dense(self.flat(self.cb(mx)))

        # combine
        out = self.add([bx, mx])
        return out

    def build_graph(self):
        x = tf.keras.layers.Input(shape=self.dim)
        return tf.keras.Model(inputs=[x], outputs=self.call(x))

mwml = ModelWithMidLayer()
plot_model(mwml.build_graph(),
           show_shapes=True, show_dtype=True,
           show_layer_names=True, expand_nested=False)

Here model_1 is actually a sub-model of the DenseNet, which probably makes the whole model (ModelWithMidLayer) compute the same operations twice. If that observation is correct, it is a real concern.
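
The observation can be checked on a toy model with a layer that counts its calls: invoking the full model and a sub-model built from it runs the shared layer once each, i.e. twice per forward pass. A minimal sketch (CountingLayer is made up for this check and relies on eager execution):

import tensorflow as tf

class CountingLayer(tf.keras.layers.Layer):
    def __init__(self):
        super().__init__()
        self.calls = 0  # python-side counter, valid in eager mode

    def call(self, inputs):
        self.calls += 1
        return inputs

inp = tf.keras.Input(shape=(4,))
counted = CountingLayer()
h = counted(inp)
out = tf.keras.layers.Dense(2)(h)

full = tf.keras.Model(inp, out)  # the "base" model
sub = tf.keras.Model(inp, h)     # sub-model up to the counted layer

x = tf.random.normal((1, 4))
_ = full(x)
_ = sub(x)
print(counted.calls)  # 2 -> the shared layer ran once per model call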


1 Answer

I thought this might be complicated, but it's actually quite simple. We just need to build a model with the desired output layers in the __init__ method and use it normally in the call method:

import tensorflow as tf
from tensorflow.keras.applications import DenseNet121
from tensorflow.keras.layers import Add
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten

class ConvBlock(tf.keras.layers.Layer):
    def __init__(self, kernel_num=32, kernel_size=(3,3), strides=(1,1), padding='same'):
        super(ConvBlock, self).__init__()
        # conv layer
        self.conv = tf.keras.layers.Conv2D(kernel_num, 
                        kernel_size=kernel_size, 
                        strides=strides, padding=padding)
        # batch norm layer
        self.bn = tf.keras.layers.BatchNormalization()

    def call(self, input_tensor, training=False):
        x = self.conv(input_tensor)
        x = self.bn(x, training=training)
        return tf.nn.relu(x)

class ModelWithMidLayer(tf.keras.Model):
    def __init__(self, dim=(32, 32, 3)):
        super().__init__()
        self.dim = dim
        self.base = DenseNet121(input_shape=self.dim, weights=None)
        
        # building sub-model from self.base which gives 
        # desired output feature maps: ie. conv2_block1_0_relu
        self.mid_layer = tf.keras.Model(
            inputs=self.base.inputs,
            outputs=[
                self.base.get_layer("conv2_block1_0_relu").output,
                self.base.output])
        self.flat = Flatten()
        self.dense = Dense(1000)
        self.add = Add()
        self.cb = ConvBlock()
    
    def call(self, x):
        # one forward pass through the shared trunk gives both outputs:
        # [base.get_layer("conv2_block1_0_relu").output, self.base.output]
        mx, bx = self.mid_layer(x)
        # apply the custom block, then make same shape or do whatever
        mx = self.dense(self.flat(self.cb(mx)))
        # combine
        out = self.add([bx, mx])
        return out
    
    def build_graph(self):
        x = tf.keras.layers.Input(shape=self.dim)
        return tf.keras.Model(inputs=[x], outputs=self.call(x))

mwml = ModelWithMidLayer()
tf.keras.utils.plot_model(mwml.build_graph(),
                          show_shapes=True, show_dtype=True,
                          show_layer_names=True, expand_nested=False)
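
To see that the shared trunk now runs only once per forward pass, note that a single mid_layer call returns both tensors (shapes assume the 32x32 setup above):

x = tf.random.normal((2, 32, 32, 3))
mid_feat, base_out = mwml.mid_layer(x)  # one pass, two outputs
print(mid_feat.shape, base_out.shape)   # e.g. (2, 8, 8, 64) and (2, 1000)
print(mwml(x).shape)                    # (2, 1000)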
