A hybrid deep learning model combining a backbone network with handcrafted features

I have a set of RGB images and want to build a regression model that predicts a "lodging score", combining DenseNet121 as the backbone with handcrafted features from a CSV file. When I run the code below, I get the error ValueError: Layer "model" expects 2 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, None, None, None) dtype=float32>]. I've been struggling with this for days, so any help would be much appreciated.

#Step 1: Import the required libraries  
import tensorflow as tf
from tensorflow.keras.applications import DenseNet121
from tensorflow.keras.layers import Dense, Dropout, Input, Concatenate, GlobalAveragePooling2D
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from sklearn.model_selection import train_test_split
import pandas as pd
import numpy as np

modelID = 'd121_HCF'

#Step 2: Load and preprocess the image data 
image_dir = r'/path_to_images_folder'
annotations_file = '/path_to/annotation.csv'
features_file = 'handcrafted_features.csv'

# Load image filenames and labels from annotations file
annotations_df = pd.read_csv(annotations_file)

image_filenames = annotations_df['Image_filename'].tolist()
labels = annotations_df['Lodging_score'].tolist()

# Load handcrafted features
features_df = pd.read_csv(features_file)
features_df.set_index('Image_filename', inplace=True)

# Get common image filenames
common_filenames = list(set(image_filenames).intersection(features_df.index))
#print(len(common_filenames))

# Filter the annotation and feature dataframes based on common filenames
annotations_df = annotations_df[annotations_df['Image_filename'].isin(common_filenames)]
features_df = features_df.loc[common_filenames]
features_df = features_df.drop(columns=['plot_id', 'project_id', 'Lodging_score'])  # drop columns that are not features

# Split the data into train, val, and test sets
train_filenames, test_filenames, train_labels, test_labels = train_test_split(
    annotations_df['Image_filename'].tolist(),
    annotations_df['Lodging_score'].tolist(),
    test_size=0.2,
    random_state=42)

val_filenames, test_filenames, val_labels, test_labels = train_test_split(
    test_filenames,
    test_labels,
    test_size=0.5,
    random_state=42)

# Preprocess handcrafted features
train_features = features_df.loc[train_filenames].values
val_features = features_df.loc[val_filenames].values
test_features = features_df.loc[test_filenames].values

# Normalize handcrafted features (compute the statistics once, before
# train_features is overwritten, so val/test use the training-set mean/std)
feat_mean = train_features.mean(axis=0)
feat_std = train_features.std(axis=0)
train_features = (train_features - feat_mean) / feat_std
val_features = (val_features - feat_mean) / feat_std
test_features = (test_features - feat_mean) / feat_std

# Convert the label arrays to numpy arrays
train_labels = np.array(train_labels)
val_labels = np.array(val_labels)
test_labels = np.array(test_labels)

# Preprocess handcrafted features
train_features = train_features[:len(train_filenames)]
val_features = val_features[:len(val_filenames)]
test_features = test_features[:len(test_filenames)]

# Define image data generator with augmentations
image_size = (75, 200)
batch_size = 32

image_data_generator = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True)

train_data = pd.DataFrame({'filename': train_filenames, 'Lodging_score': train_labels})
train_generator = image_data_generator.flow_from_dataframe(
    train_data,
    directory=image_dir,
    x_col='filename',
    y_col='Lodging_score',
    target_size=image_size,
    batch_size=batch_size,
    class_mode='raw',
    shuffle=False)

val_generator = image_data_generator.flow_from_dataframe(
    pd.DataFrame({'filename': val_filenames, 'Lodging_score': val_labels}),
    directory=image_dir,
    x_col='filename',
    y_col='Lodging_score',
    target_size=image_size,
    batch_size=batch_size,
    class_mode='raw',
    shuffle=False)

# Create test generator
test_generator = image_data_generator.flow_from_dataframe(
    pd.DataFrame({'filename': test_filenames, 'Lodging_score': test_labels}),
    directory=image_dir,
    x_col='filename',
    y_col='Lodging_score',
    target_size=image_size,
    batch_size=batch_size,  # Keep the batch size the same as the other generators
    class_mode='raw',
    shuffle=False)

#Step 3: Build the hybrid regression model
# Load DenseNet121 pre-trained on ImageNet without the top layer
base_model = DenseNet121(include_top=False, weights='imagenet', input_shape=image_size + (3,))

# Freeze the base model's layers
base_model.trainable = False

# Input layers for image data and handcrafted features
image_input = Input(shape=image_size + (3,))
features_input = Input(shape=(train_features.shape[1],))

# Preprocess image input for DenseNet121
image_preprocessed = tf.keras.applications.densenet.preprocess_input(image_input)

# Extract features from the base model
base_features = base_model(image_preprocessed, training=False)
base_features = GlobalAveragePooling2D()(base_features)

# Combine base model features with handcrafted features
combined_features = Concatenate()([base_features, features_input])

# Add dense layers for regression
x = Dropout(0.5)(combined_features)
x = Dense(128, activation='relu')(x)
x = Dropout(0.5)(x)
output = Dense(1, activation='linear')(x)

# Create the model
model = Model(inputs=[image_input, features_input], outputs=output)

# Compile the model
model.compile(optimizer=Adam(learning_rate=0.001), loss='mean_squared_error')

#Step 4: Train the model with early stopping   
# Define early stopping callback
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss', patience=5, restore_best_weights=True)

# Convert numpy arrays to tensors
train_features_tensor = tf.convert_to_tensor(train_features, dtype=tf.float32)
val_features_tensor = tf.convert_to_tensor(val_features, dtype=tf.float32)
test_features_tensor = tf.convert_to_tensor(test_features, dtype=tf.float32)

# Train the model
history = model.fit(
    train_generator,
    steps_per_epoch=len(train_generator),
    epochs=50,
    validation_data=([val_generator.next()[0], val_features], val_labels),
    validation_steps=len(val_generator),
    callbacks=[early_stopping])

# Evaluate the model on the test set
loss = model.evaluate([test_generator.next()[0], test_features], test_labels, verbose=0)
predictions = model.predict([test_generator.next()[0], test_features])

1 Answer

Take a look at tf.data.Dataset to see how to read data from a DataFrame. You can then preprocess each element with the Dataset.map() method, or use Keras preprocessing layers (see the preprocessing-layers guide). You can also use those layers for data augmentation.

Because your model has two input layers, your data must also supply both inputs. Example code:

import numpy as np
import pandas as pd
import tensorflow as tf


img_paths = ['test'] * 100  # 100 image paths here
rand_features = np.random.rand(100, 3)  # random features
rand_labels = np.random.randint(0, 10, size=(100, 1))  # random labels as int

#right now, the dataset has a 3-tuple as samples
ds = tf.data.Dataset.from_tensor_slices((img_paths, rand_features, rand_labels))
ds = ds.map(lambda x, y, z: ((tf.image.decode_image(tf.io.read_file(x)), y), z))
ds = ds.batch(32)  # batch the data

ds.map(...) 的调用中,它会从 image_path 字符串中读取一张图片,并将 (image, feature, label) 这个三元组转换为 ((image, feature), label) 这个嵌套的二元组。现在,model.fit 可以将每个 (image, feature) 作为两个输入(x),而 label 作为(y)。把这个放在你从数据框读取图片和特征之后。你也可以看看其他有用的 Dataset 方法,比如 .prefetch().shuffle()

Edit:
Here is how to fit the Dataset into your code. I left out the first part of your code, which stays unchanged, and start from a portion of it to show where the new code goes.

# [your previous code here]

# Preprocess handcrafted features
train_features = train_features[:len(train_filenames)]
val_features = val_features[:len(val_filenames)]
test_features = test_features[:len(test_filenames)]

# Define image data generator with augmentations
image_size = (75, 200)
batch_size = 32

# -----------------------------------
# creating the datasets here
# define one loading function so train/val/test stay consistent; join with
# image_dir (flow_from_dataframe did this for you) and resize to the model size
def load_sample(path, feats, label):
    img = tf.io.read_file(tf.strings.join([image_dir, path], separator='/'))
    img = tf.io.decode_image(img, channels=3, expand_animations=False)
    img = tf.image.resize(img, image_size)  # float32 in [0, 255]
    return (img, feats), label

train_ds = tf.data.Dataset.from_tensor_slices((train_filenames, train_features, train_labels))
train_ds = train_ds.map(load_sample)
# this utilizes Datasets fully
train_ds = train_ds.cache().shuffle(10000).batch(batch_size).prefetch(2)

val_ds = tf.data.Dataset.from_tensor_slices((val_filenames, val_features, val_labels))
val_ds = val_ds.map(load_sample)
val_ds = val_ds.cache().batch(batch_size).prefetch(2)

test_ds = tf.data.Dataset.from_tensor_slices((test_filenames, test_features, test_labels))
test_ds = test_ds.map(load_sample)
test_ds = test_ds.batch(batch_size).prefetch(2)

# augmentation pipeline, built from Keras preprocessing layers
augment_ = tf.keras.Sequential([
    tf.keras.layers.RandomFlip(mode='horizontal'),
    tf.keras.layers.RandomTranslation(height_factor=0.1, width_factor=0.1),
    tf.keras.layers.RandomRotation(20 / 360),  # factor is a fraction of a full turn, ~20 degrees
])

#Step 3: Build the hybrid regression model
# Load DenseNet121 pre-trained on ImageNet without the top layer
base_model = DenseNet121(include_top=False, weights='imagenet', input_shape=image_size + (3,))

# Freeze the base model's layers
base_model.trainable = False

# Input layers for image data and handcrafted features
image_input = Input(shape=image_size + (3,))
features_input = Input(shape=(train_features.shape[1],))

# Preprocess image input for DenseNet121
image_preprocessed = tf.keras.applications.densenet.preprocess_input(image_input)
image_preprocessed = augment_(image_preprocessed)
# --------------------------

You don't need to (and in fact shouldn't) rescale the images by /255., because densenet.preprocess_input expects inputs in the [0, 255] range. The random augmentations you did with ImageDataGenerator are now implemented as Keras layers and applied via augment_(). This is better than your previous setup, because augmentation layers are automatically disabled on the val and test data.
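A small sketch of that last point, using hypothetical tensors: random-augmentation layers only transform their input when called in training mode, and act as the identity otherwise:

imgs = tf.random.uniform((2, 75, 200, 3), maxval=255.0)
flip = tf.keras.layers.RandomFlip(mode='horizontal')
augmented = flip(imgs, training=True)     # may flip some images
passthrough = flip(imgs, training=False)  # inference mode: returns inputs unchanged
print(bool(tf.reduce_all(passthrough == imgs)))  # True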

A note on train_ds.cache().shuffle(10000).batch(batch_size).prefetch(2):
These methods make the Dataset pipeline more efficient, and their order matters. If you shuffle before caching, the shuffled order is what gets cached and the data is never reshuffled afterwards. You also want shuffle() before batch(); otherwise you shuffle whole batches while the items inside each batch stay fixed. prefetch(x) preloads x elements to speed up training: before batch() it would only preload x samples, after batch() it preloads x whole batches, which is what we want. The val and test sets are not shuffled, since their order doesn't affect evaluation. test is also not cached: caching has an upfront cost that only pays off on later passes, and a test set usually runs just once.
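To make those ordering pitfalls concrete, here is a tiny sketch on a toy range dataset (all names are illustrative):

nums = tf.data.Dataset.range(10)

frozen = nums.shuffle(10).cache()      # the shuffled order is baked into the cache
reshuffled = nums.cache().shuffle(10)  # reshuffles every epoch

coarse = nums.batch(2).shuffle(5)  # shuffles whole batches; items inside each batch stay fixed
fine = nums.shuffle(10).batch(2)   # mixes items across batch boundaries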

Also note that the DenseNet121 documentation says it expects input images of shape (224, 224, 3), so verify that it handles your image size correctly. And I haven't tested the code, since I don't have your data.
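One quick way to check (a sketch, assuming the base_model defined above): with include_top=False, DenseNet121 accepts other spatial sizes as long as they survive the roughly 32x downsampling, so push a dummy batch through it before training:

probe = tf.random.uniform((1, 75, 200, 3), maxval=255.0)
print(base_model(probe, training=False).shape)  # spatial dims shrink ~32x, e.g. (1, 2, 6, 1024)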
