Pandas HDFStore.create_table_index does not speed up select queries; looking for a better way to search

1 answer

User

#1 · Posted 2024-04-24 13:46:53

I think your columns are already indexed when you specify data_columns=True.

See this demo:

In [39]: df = pd.DataFrame(np.random.randint(0,100,size=(10, 3)), columns=list('ABC'))

In [40]: fn = 'c:/temp/x.h5'

In [41]: store = pd.HDFStore(fn)

In [42]: store.append('table_no_dc', df, format='table')

In [43]: store.append('table_dc', df, format='table', data_columns=True)

In [44]: store.append('table_dc_no_index', df, format='table', data_columns=True, index=False)

data_columns not specified, so only the index is indexed:

In [45]: store.get_storer('table_no_dc').group.table

data_columns=True: all data columns are indexed:

In [46]: store.get_storer('table_dc').group.table
Out[46]:
/table_dc/table (Table(10,)) ''
  description := {
  "index": Int64Col(shape=(), dflt=0, pos=0),
  "A": Int32Col(shape=(), dflt=0, pos=1),
  "B": Int32Col(shape=(), dflt=0, pos=2),
  "C": Int32Col(shape=(), dflt=0, pos=3)}
  byteorder := 'little'
  chunkshape := (3276,)
  autoindex := True
  colindexes := {
    "C": Index(6, medium, shuffle, zlib(1)).is_csi=False,
    "A": Index(6, medium, shuffle, zlib(1)).is_csi=False,
    "index": Index(6, medium, shuffle, zlib(1)).is_csi=False,
    "B": Index(6, medium, shuffle, zlib(1)).is_csi=False}

data_columns=True, index=False: we have the data column information, but no indexes for them:

In [47]: store.get_storer('table_dc_no_index').group.table
Out[47]:
/table_dc_no_index/table (Table(10,)) ''
  description := {
  "index": Int64Col(shape=(), dflt=0, pos=0),
  "A": Int32Col(shape=(), dflt=0, pos=1),
  "B": Int32Col(shape=(), dflt=0, pos=2),
  "C": Int32Col(shape=(), dflt=0, pos=3)}
  byteorder := 'little'
  chunkshape := (3276,)

colindexes shows the list of indexed columns in the examples above.
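To check the same thing programmatically, here is a minimal sketch. It assumes PyTables is installed, writes to a temporary file rather than c:/temp/x.h5, and uses a hypothetical helper named indexed_columns that reads the is_indexed flag of each PyTables column instead of parsing the printed colindexes block. It also shows create_table_index adding an index to a table that was written with index=False, followed by a select on that data column. Whether the index makes select noticeably faster depends on the table size and the query, so treat this as an illustration rather than a benchmark.

```python
import os
import tempfile

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(10, 3)), columns=list("ABC"))

# Assumption: a throwaway temporary file stands in for c:/temp/x.h5
path = os.path.join(tempfile.mkdtemp(), "x.h5")

with pd.HDFStore(path) as store:
    store.append("table_dc", df, format="table", data_columns=True)
    store.append("table_dc_no_index", df, format="table",
                 data_columns=True, index=False)

    def indexed_columns(key):
        """Hypothetical helper: names of the columns that have a PyTables index."""
        table = store.get_storer(key).group.table
        return [name for name in table.colnames
                if getattr(table.cols, name).is_indexed]

    dc_indexed = indexed_columns("table_dc")
    before = indexed_columns("table_dc_no_index")

    # Add an index on column A only, after the data has been written
    store.create_table_index("table_dc_no_index", columns=["A"])
    after = indexed_columns("table_dc_no_index")

    # Data columns can be used in a where clause with or without an index
    result = store.select("table_dc_no_index", where="A > 50")

print(dc_indexed)
print(before, after)
print(result)
```

If after already lists every column you filter on, calling create_table_index again on those columns adds nothing, which may be why it does not change the select timing.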
