How does the tf.data input pipeline with data augmentation work on every epoch?

def get_baseline_dataset(filenames,
                         labels,
                         preproc_fn=functools.partial(_augment),
                         threads=5,
                         batch_size=batch_size,
                         shuffle=True):
    num_x = len(filenames)
    # Create a dataset from the filenames and labels
    dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))
    # Map our preprocessing function to every element in our dataset, taking
    # advantage of multithreading
    dataset = dataset.map(_process_pathnames, num_parallel_calls=threads)
    if preproc_fn.keywords is not None and 'resize' not in preproc_fn.keywords:
        assert batch_size == 1, "Batching images must be of the same size"

    dataset = dataset.map(preproc_fn, num_parallel_calls=threads)

    if shuffle:
        dataset = dataset.shuffle(num_x)

    # It's necessary to repeat our data for all epochs
    dataset = dataset.repeat().batch(batch_size)
    return dataset

train_ds = get_baseline_dataset(x_train_filenames, y_train_filenames, preproc_fn=tr_preprocessing_fn, batch_size=batch_size)

1 Answer

User

#1 · Posted on 2024-04-26 13:25:53

I am quoting the basic steps from https://cs230-stanford.github.io/tensorflow-input-data. I suggest you skim through that article for the details.

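"To summarize, one good order for the different transformations is:

create the dataset
shuffle (with a big enough buffer size)
repeat
map with the actual work (preprocessing, augmentation…) using multiple parallel calls
batch
prefetch"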

This should be what you want, because the augmentation comes after the repeat. Hope this helps.
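The order quoted above can be sketched roughly as follows. This is only a minimal illustration, not the asker's actual pipeline: `augment` and `build_pipeline` are hypothetical names, the data is a small in-memory array of zeros instead of image files, and the "augmentation" is just random noise so that the per-epoch differences are easy to see.

```python
import numpy as np
import tensorflow as tf


def augment(x, label):
    # Hypothetical augmentation: add fresh random noise each time it is called.
    noise = tf.random.uniform(tf.shape(x), minval=-0.1, maxval=0.1)
    return x + noise, label


def build_pipeline(features, labels, batch_size=2, threads=2):
    num_x = len(features)
    ds = tf.data.Dataset.from_tensor_slices((features, labels))  # 1. create
    ds = ds.shuffle(num_x)                                       # 2. shuffle
    ds = ds.repeat()                                             # 3. repeat
    ds = ds.map(augment, num_parallel_calls=threads)             # 4. map
    ds = ds.batch(batch_size)                                    # 5. batch
    ds = ds.prefetch(1)                                          # 6. prefetch
    return ds


# Toy stand-in data: 4 examples of 3 zeros each, labelled 0..3.
features = np.zeros((4, 3), dtype=np.float32)
labels = np.arange(4)
ds = build_pipeline(features, labels)

steps_per_epoch = 2  # 4 examples / batch size of 2
for step, (x, y) in enumerate(ds.take(2 * steps_per_epoch)):
    epoch = step // steps_per_epoch
    print("epoch", epoch, "labels", y.numpy(), "values", x.numpy().round(3))
```

Printing two epochs should show the same four labels in each pass, but with different noise values. As far as I know, tf.data calls the mapped function again every time an element is produced, so random ops draw new values on each pass; adding a `cache()` after the augmentation step would freeze the augmented values instead.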
