Correctly setting and reading dimension scales in an HDF5 file with Python

0 votes
2 answers
572 views
Asked 2025-05-10 15:28

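I am trying to attach dimension scales to a dataset that I want to store in an HDF5 file using Python, but I get an error when I try to print the dataset's attributes after setting the scales. The relevant code snippet is as follows: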

import h5py as h5
import numpy as np

# create data and x-axis
my_data = np.random.randint(10, size=(100, 200))
x_axis  = np.linspace(0, 1, 100)

h5f = h5.File('my_file.h5','w')
h5f.create_dataset( 'data_1', data=my_data )
h5f['data_1'].dims[0].label = 'm'
h5f['data_1'].dims.create_scale( h5f['x_axis'], 'x' )

# the following line is creating the problems
h5f['data_1'].dims[0].attach_scale( h5f['x_axis'] )

# this is where the crash happens but only if the above line is included
for ii in h5f['data_1'].attrs.items():
    print ii

h5f.close()

The command print(h5.version.info) outputs the following:

Summary of the h5py configuration
---------------------------------

h5py    2.2.1
HDF5    1.8.11
Python  2.7.6 (default, Jun 22 2015, 17:58:13) 
[GCC 4.8.2]
sys.platform    linux2
sys.maxsize     9223372036854775807
numpy   1.8.2

The error message is as follows:

Traceback (most recent call last):
  File "HDF_write_dimScales.py", line 16
    for ii in h5f['data_1'].attrs.items():
  File "/usr/lib/python2.7/dist-packages/h5py/_hl/base.py", line 347, in items
    return [(x, self.get(x)) for x in self]
  File "/usr/lib/python2.7/dist-packages/h5py/_hl/base.py", line 310, in get
    return self[name]
  File "/usr/lib/python2.7/dist-packages/h5py/_hl/attrs.py", line 55, in __getitem__
    rtdt = readtime_dtype(attr.dtype, [])
  File "h5a.pyx", line 318, in h5py.h5a.AttrID.dtype.__get__ (h5py/h5a.c:4285)
  File "h5t.pyx", line 337, in h5py.h5t.TypeID.py_dtype (h5py/h5t.c:3892)
TypeError: No NumPy equivalent for TypeVlenID exists

Any ideas or hints are greatly appreciated.


2 Answers

0

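This is only a guess, but since the error mentions TypeVlenID, it may be related to h5py's incomplete vlen implementation (especially in the version of the module in use here).

Inexplicable behavior when using vlen with h5py

Writing variable-length strings to a compound dataset via h5py (HDF5)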

1

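With a small adjustment, this works for me on h5py 2.5.0. The problem may lie in the point at which you call create_scale. In h5py 2.5.0, your create_scale() call raised a KeyError referring to h5f['x_axis'], because no dataset with that name had been written to the file yet. To get your example running, I had to create the x_axis dataset explicitly first.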

import h5py
import numpy as np

# create data and x-axis
my_data = np.random.randint(10, size=(100, 200))

# Use a context manager to ensure h5f is closed
with h5py.File('my_file.h5','w') as h5f:
    h5f.create_dataset( 'data_1', data=my_data )

    # Create the x_axis dataset directly in the HDF5 file
    h5f['x_axis']  = np.linspace(0, 1, 100)

    h5f['data_1'].dims[0].label = 'm'

    # Now we can create and attach the scale without problems
    h5f['data_1'].dims.create_scale( h5f['x_axis'], 'x' )
    h5f['data_1'].dims[0].attach_scale( h5f['x_axis'] )

    for ii in h5f['data_1'].attrs.items():
        print(ii)

# Output
#(u'DIMENSION_LABELS', array(['m', ''], dtype=object))
#(u'DIMENSION_LIST', array([array([<HDF5 object reference>], dtype=object),
#       array([], dtype=object)], dtype=object))
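For the reading half of the question, here is a minimal sketch of getting the label and the attached scale back through the dims interface rather than through the raw attributes. It assumes a recent h5py release in which Dataset.make_scale() is available as the newer spelling of dims.create_scale(); the temporary file path is only there so the snippet does not leave files behind.

```python
import os
import tempfile

import h5py
import numpy as np

my_data = np.random.randint(10, size=(100, 200))

# Assumption: a throwaway location for the demo file
path = os.path.join(tempfile.mkdtemp(), "my_file.h5")

with h5py.File(path, "w") as h5f:
    data = h5f.create_dataset("data_1", data=my_data)
    x_axis = h5f.create_dataset("x_axis", data=np.linspace(0, 1, 100))

    data.dims[0].label = "m"
    # Assumption: recent h5py; older releases use data.dims.create_scale(x_axis, "x")
    x_axis.make_scale("x")
    data.dims[0].attach_scale(x_axis)

# Reopen read-only and go through dims instead of attrs
with h5py.File(path, "r") as h5f:
    dim = h5f["data_1"].dims[0]
    label = dim.label                # label set on the first dimension
    scale_names = list(dim.keys())   # names of the scales attached to it
    x_values = dim[0][...]           # first attached scale, read into a NumPy array

print(label, scale_names, x_values.shape)
```

Going through dims avoids having to decode DIMENSION_LIST yourself, since that attribute only holds object references to the scale datasets.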

If you still run into problems, you may need to upgrade to h5py 2.5.0, since that version handles VLEN types better (though still not perfectly).
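If upgrading is not an option, one possible workaround is to read the attributes one at a time and skip the ones h5py cannot convert. The traceback in the question shows the TypeError being raised while a value is fetched, not while the names are listed, so iterating over the names should still work. This is only a sketch: readable_attrs is a hypothetical helper, not part of h5py, and the in-memory file (driver="core", backing_store=False) and make_scale() call assume a recent h5py release.

```python
import h5py
import numpy as np


def readable_attrs(dset):
    """Hypothetical helper: return (values, skipped_names) for a dataset's attributes."""
    readable, skipped = {}, []
    for name in dset.attrs:          # iterating yields attribute names only
        try:
            readable[name] = dset.attrs[name]
        except TypeError:            # e.g. "No NumPy equivalent for TypeVlenID exists"
            skipped.append(name)
    return readable, skipped


# Assumption: in-memory file so the demo writes nothing to disk
with h5py.File("attrs_demo.h5", "w", driver="core", backing_store=False) as h5f:
    dset = h5f.create_dataset("data_1", data=np.zeros((100, 200)))
    h5f["x_axis"] = np.linspace(0, 1, 100)

    dset.dims[0].label = "m"
    h5f["x_axis"].make_scale("x")
    dset.dims[0].attach_scale(h5f["x_axis"])

    readable, skipped = readable_attrs(dset)
    names = sorted(list(readable) + skipped)

print(names)
print("skipped:", skipped)
```

On a version that reads VLEN attributes correctly, skipped should simply come back empty.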
