Loading an image dataset

User

Reply #1 · edited 2024-06-16 12:33:05

I think you can iterate over the CSV file of IDs and labels to read the images. For example:

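import csv
import os

# Assumption: any image reader works here; the original snippet did not say which imread it used.
from matplotlib.image import imread

csv_path = 'your_csv_path'
images_base_path = 'your_images_path'

images = []
labels = []

with open(csv_path, newline='', encoding='utf8') as csvfile:
    # csv.reader splits on commas by default; pass delimiter=... for other separators.
    # If the file has a header row, call next(reader) before the loop.
    reader = csv.reader(csvfile)
    for row in reader:
        # row[0] is the image file name, row[1] is its label.
        image_path = os.path.join(images_base_path, row[0])
        images.append(imread(image_path))
        labels.append(row[1])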

You then have the images and their labels. This is only a sketch of the idea, but it should be easy to build on. Hope it helps.
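
A variant of the same idea, as a minimal sketch: it only collects file paths and labels, so the images can be decoded later. It assumes the CSV has a header row; the function name `read_image_index`, the column names `id` and `label`, and the `.jpg` default are hypothetical and not from the question.

```python
import csv
import os


def read_image_index(csv_path, images_dir, id_col="id", label_col="label", ext=".jpg"):
    """Return (paths, labels) for every CSV row whose image file exists."""
    paths, labels = [], []
    with open(csv_path, newline="", encoding="utf8") as csvfile:
        for row in csv.DictReader(csvfile):
            name = row[id_col]
            # Append the extension only when the ID does not already carry one.
            if not os.path.splitext(name)[1]:
                name += ext
            path = os.path.join(images_dir, name)
            # Rows that point at a missing file are skipped silently.
            if os.path.isfile(path):
                paths.append(path)
                labels.append(row[label_col])
    return paths, labels
```

Labels come back as strings, exactly as they appear in the CSV; convert them with int() if the model expects integer classes.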

User

Reply #2 · edited 2024-06-16 12:33:05

You can load images listed in a CSV file with the flow_from_dataframe method of ImageDataGenerator.
Code:

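import tensorflow as tf
import pandas as pd

# dtype=str keeps the labels as strings, which the default class_mode='categorical' requires
df = pd.read_csv('data/img/new.csv', dtype=str)

# Data augmentation pipeline (no augmentation options are set here)
train_datagen = tf.keras.preprocessing.image.ImageDataGenerator()

# Read the files named in the DataFrame, relative to `directory`
train_ds = train_datagen.flow_from_dataframe(
    df,
    directory='data/img/new',
    x_col='filename',
    y_col='label',
)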

The DataFrame looks like this:

    filename    label
0   Capture.PNG 0

If the column holds only IDs without a file extension, you can use the apply method to append .jpg:

df['id'] = df['id'].apply(lambda x: '{}.jpg'.format(x))

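For the full set of data augmentation options offered by ImageDataGenerator, see its API documentation.

For the full set of flow_from_dataframe options, see the API documentation for that method.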

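With this approach you do not have to worry about images and labels getting mismatched, because the pairing is handled by this built-in TensorFlow method. The files are also loaded only when they are needed, which avoids filling up main memory.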

For training, you can simply use:

model.fit(
        train_ds,
        steps_per_epoch=2000,
        epochs=50,
        validation_data=validation_ds,
        validation_steps=800)
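
The step counts in the call above are specific to the answerer's data. One common convention is to make each epoch a single pass over the dataset, which can be sketched as below. The helper name `steps_for` and the sample counts are hypothetical; 32 is the default batch_size of flow_from_dataframe.

```python
import math


def steps_for(num_samples, batch_size=32):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


# Hypothetical dataset sizes, for illustration only.
train_steps = steps_for(num_samples=1000)  # 32
val_steps = steps_for(num_samples=250)     # 8
```

These values would then be passed as steps_per_epoch and validation_steps.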

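User

Reply #3 · edited 2024-06-16 12:33:05

Use os.walk(directory) to get the list of file names, sorted alphabetically.
Read the CSV file and build a list labels_list of class labels in the same order as the file names.
Call tf.keras.preprocessing.image_dataset_from_directory() with the argument labels=labels_list.

This gives you a tf.data.Dataset that you can pass to the training function.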
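
The steps above can be sketched as follows, assuming all images sit directly in one flat directory and the CSV has a header row. The function names and the column names `filename` and `label` are hypothetical. The extension filter is an assumption about which files image_dataset_from_directory picks up, so it is worth comparing the resulting order against the dataset's file_paths attribute before training.

```python
import csv
import os

# Assumption: the image formats the loader accepts.
IMAGE_EXTS = (".bmp", ".gif", ".jpeg", ".jpg", ".png")


def labels_in_file_order(directory, csv_path, name_col="filename", label_col="label"):
    """Return (file_names, labels_list), both in sorted file-name order."""
    with open(csv_path, newline="", encoding="utf8") as csvfile:
        label_by_name = {
            row[name_col]: int(row[label_col]) for row in csv.DictReader(csvfile)
        }
    file_names = sorted(
        f for f in os.listdir(directory) if f.lower().endswith(IMAGE_EXTS)
    )
    # A KeyError here means an image on disk has no row in the CSV.
    return file_names, [label_by_name[f] for f in file_names]


def build_dataset(directory, labels_list):
    # Imported lazily so the helper above works without TensorFlow installed.
    import tensorflow as tf

    return tf.keras.preprocessing.image_dataset_from_directory(
        directory, labels=labels_list, label_mode="int"
    )
```

Typical use would be `_, labels_list = labels_in_file_order(directory, csv_path)` followed by `build_dataset(directory, labels_list)`.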
