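Merging two HDF5 files with PyTables
ptrepack almost does what I need, but it can only overwrite or ignore duplicate paths. The example below shows what I would like to happen to the structure.
Input file one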
/ (RootGroup) ''
/data_set_name (Group) 'group of images files'
/data_set_name_Set (EArray(7913, 128)) ''
/data_set_name/image_set_index (Table(3,)) ''
/data_set_name/i100 (Group) 'sift features and coordinates'
/data_set_name/i100/descriptors (Array(7913, 128)) 'sift descriptors'
/data_set_name/i100/locations (Array(7913, 4)) 'sift feature locations'
Input file two
/ (RootGroup) ''
/data_set_name (Group) 'group of images files'
/data_set_name_Set (EArray(4328, 128)) ''
/data_set_name/image_set_index (Table(4,)) ''
/data_set_name/i1156 (Group) 'sift features and coordinates'
/data_set_name/i1156/descriptors (Array(4328, 128)) 'sift descriptors'
/data_set_name/i1156/locations (Array(4328, 4)) 'sift feature locations'
Desired output
/ (RootGroup) ''
/data_set_name (Group) 'group of images files'
/data_set_name_Set (EArray(12241, 128)) ''
/data_set_name/image_set_index (Table(7,)) ''
/data_set_name/i100 (Group) 'sift features and coordinates'
/data_set_name/i100/descriptors (Array(7913, 128)) 'sift descriptors'
/data_set_name/i100/locations (Array(7913, 4)) 'sift feature locations'
/data_set_name/i1156 (Group) 'sift features and coordinates'
/data_set_name/i1156/descriptors (Array(4328, 128)) 'sift descriptors'
/data_set_name/i1156/locations (Array(4328, 4)) 'sift feature locations'
Is there an efficient way to do this?
1 Answer
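Do you mean that you want datasets with the same path to be enlarged automatically? I had not thought about this before, but it sounds like a nice feature (it would only work for enlargeable arrays, though). I have made a note of it.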
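In the meantime, the merge can be approximated by hand with the regular PyTables API. The following is only a minimal sketch, not a ptrepack feature: `merge_into` and `merge_files` are hypothetical helper names, and it assumes that every leaf present in both files is an enlargeable `EArray` or `Table` with compatible shapes and columns. Paths that exist only in the source file are copied over unchanged. Each colliding leaf is read into memory in one piece, so very large arrays would need to be appended in chunks instead.

```python
import tables


def merge_into(src_group, dst_file):
    """Recursively merge the children of src_group into dst_file.

    Hypothetical helper (not part of PyTables). Assumes leaves that
    exist in both files are EArray or Table nodes.
    """
    src_file = src_group._v_file
    dst_parent = dst_file.get_node(src_group._v_pathname)
    for child in src_group._f_iter_nodes():
        path = child._v_pathname
        if path not in dst_file:
            # Path only exists in the source: copy it (with its subtree
            # when it is a group) into the matching destination group.
            src_file.copy_node(
                child,
                newparent=dst_parent,
                recursive=isinstance(child, tables.Group),
            )
            continue
        existing = dst_file.get_node(path)
        if isinstance(child, tables.Group) and isinstance(existing, tables.Group):
            # Group exists on both sides: descend and merge its children.
            merge_into(child, dst_file)
        elif isinstance(child, tables.Leaf) and isinstance(
            existing, (tables.EArray, tables.Table)
        ):
            # Enlargeable leaf on both sides: append the source rows.
            existing.append(child.read())
        else:
            raise ValueError(f"cannot merge non-enlargeable node {path}")


def merge_files(src_path, dst_path):
    """Merge the file at src_path into the existing file at dst_path, in place."""
    with tables.open_file(src_path, "r") as src, \
            tables.open_file(dst_path, "a") as dst:
        merge_into(src.root, dst)
```

Calling `merge_files("two.h5", "one.h5")` (file names are placeholders) modifies `one.h5` in place, so work on a copy if the original has to be kept. Plain `Array` nodes such as `descriptors` and `locations` are never appended to; a collision on one of those raises an error rather than silently overwriting it.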