Resizing a numpy.memmap array

>>> import numpy as np
>>> a = np.arange(10)
>>> a.resize(20)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
>>> a = np.memmap('bla.bin', dtype=int)
>>> a
memmap([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
>>> a.resize(20, refcheck=False)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-41-f1546111a7a1> in <module>()
----> 1 a.resize(20, refcheck=False)

ValueError: cannot resize this array: it does not own its data

2 Answers

User

Answer 1 · edited 2024-05-14 09:00:28

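The problem is that the OWNDATA flag is False when the array is created. You can change that by requiring the flag to be True when you create the array: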

>>> a = np.require(np.memmap('bla.bin', dtype=int), requirements=['O'])
>>> a.shape
(10,)
>>> a.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> a.resize(20, refcheck=False)
>>> a.shape
(20,)

The only caveat is that it may create the array and then make a copy of it to ensure the requirements are met.
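To make the caveat concrete, here is a minimal sketch of what that copy means in practice. The temporary file path and the explicit `np.int64` dtype are assumptions made for the illustration; they are not part of the answer above.

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch file standing in for 'bla.bin' (assumption).
path = os.path.join(tempfile.mkdtemp(), "bla.bin")
np.zeros(10, dtype=np.int64).tofile(path)

mm = np.memmap(path, dtype=np.int64, mode="r+")
owned = np.require(mm, requirements=["O"])

# The required array owns its data, so it cannot be the file mapping itself.
owns_data = bool(owned.flags["OWNDATA"])
shares_file_buffer = np.shares_memory(owned, mm)

owned.resize(20, refcheck=False)
owned[15] = 7  # changes only the in-memory copy

# The file on disk still holds the original 10 items.
file_size = os.path.getsize(path)
```

In this sketch the resized array lives in memory only, so writes to it are not expected to reach `bla.bin`.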

Edit to address saving:

If you want to save the resized array to disk, you can save it as a .npy-formatted file and, when you need to reopen it and use it as a memmap, open it as a numpy.memmap.
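The snippet that accompanied this step is not preserved here. As a rough sketch of the idea, one way to do it is `np.save` followed by `np.load` with `mmap_mode`; the file name and the values written are assumptions for the example, and the original answer may have used a different call to reopen the file.

```python
import os
import tempfile

import numpy as np

# Hypothetical file name (assumption); any path ending in .npy would do.
npy_path = os.path.join(tempfile.mkdtemp(), "bla.npy")

a = np.zeros(10, dtype=np.int64)
a.resize(20, refcheck=False)  # in-memory array, so resizing is allowed
a[9] = 1

np.save(npy_path, a)  # writes a small header followed by the raw data

# With mmap_mode set, np.load returns a memmap positioned after the header.
b = np.load(npy_path, mmap_mode="r+")
b[19] = 5
b.flush()

reloaded = np.load(npy_path)
```

Opening the .npy file with a bare `np.memmap(npy_path, dtype=int)` would also map the header bytes as data, which is why the sketch goes through `np.load`.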

Edit to offer another approach:

You can get close to what you are looking for by resizing the underlying mmap (a.base or a._mmap, stored as uint8) and "reloading" the memmap:

>>> a = np.memmap('bla.bin', dtype=int)
>>> a
memmap([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
>>> a[3] = 7
>>> a
memmap([0, 0, 0, 7, 0, 0, 0, 0, 0, 0])
>>> a.flush()
>>> a = np.memmap('bla.bin', dtype=int)
>>> a
memmap([0, 0, 0, 7, 0, 0, 0, 0, 0, 0])
>>> a.base.resize(20*8)
>>> a.flush()
>>> a = np.memmap('bla.bin', dtype=int)
>>> a
memmap([0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
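The `20*8` in the session above is a size in bytes: 20 items times 8 bytes per item. The size of `int` can differ between platforms and NumPy versions, so it may be safer to derive the factor from the dtype. A small sketch, assuming an 8-byte integer dtype and a throwaway file path:

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch file (assumption).
path = os.path.join(tempfile.mkdtemp(), "bla.bin")

# Assumption: 8-byte integers, matching the 20*8 used above.
dtype = np.dtype(np.int64)
np.zeros(20, dtype=dtype).tofile(path)

# Bytes needed for 20 items of this dtype.
nbytes_for_20 = 20 * dtype.itemsize

# With no shape given, np.memmap infers the length from the file size.
a = np.memmap(path, dtype=dtype, mode="r")
inferred_len = os.path.getsize(path) // dtype.itemsize
```

This is also why "reloading" the memmap after the file has grown picks up the new length without an explicit shape.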

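User

Answer 2 · edited 2024-05-14 09:00:28

If I'm not mistaken, this achieves essentially what @wwwslinger's second solution does, but without having to manually specify the size of the new memmap in bytes: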

In [1]: a = np.memmap('bla.bin', mode='w+', dtype=int, shape=(10,))

In [2]: a[3] = 7

In [3]: a
Out[3]: memmap([0, 0, 0, 7, 0, 0, 0, 0, 0, 0])

In [4]: a.flush()

# this will append to the original file as much as is necessary to satisfy
# the new shape requirement, given the specified dtype
In [5]: new_a = np.memmap('bla.bin', mode='r+', dtype=int, shape=(20,))

In [6]: new_a
Out[6]: memmap([0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [7]: a[-1] = 10

In [8]: a
Out[8]: memmap([ 0,  0,  0,  7,  0,  0,  0,  0,  0, 10])

In [9]: a.flush()

In [11]: new_a
Out[11]: 
memmap([ 0,  0,  0,  7,  0,  0,  0,  0,  0, 10,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0])

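This approach works well when the new array needs to be bigger than the old one, but I don't think it allows the memory-mapped file to be truncated automatically when the new array is smaller.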
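One way to check this behaviour is to watch the file size while reopening the file with different shapes. The sketch below assumes a temporary file and an explicit `np.int64` dtype, neither of which comes from the answer itself.

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch file (assumption).
path = os.path.join(tempfile.mkdtemp(), "bla.bin")

a = np.memmap(path, mode="w+", dtype=np.int64, shape=(10,))
a.flush()
size_before = os.path.getsize(path)

# Reopening with a larger shape in 'r+' mode extends the file.
bigger = np.memmap(path, mode="r+", dtype=np.int64, shape=(20,))
size_after_grow = os.path.getsize(path)

# Reopening with a smaller shape maps fewer items but leaves the file alone.
smaller = np.memmap(path, mode="r+", dtype=np.int64, shape=(5,))
size_after_shrink = os.path.getsize(path)
```

Under these assumptions the file grows from 80 to 160 bytes and then stays at 160 bytes, even though `smaller` only exposes five items.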

Manually resizing the base, as in @wwwslinger's answer, seems to allow the file to be truncated, but it does not reduce the size of the array.

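If the goal is to end up with both a shorter file and a shorter array, one possible workaround is to release the old memmap, truncate the file on disk, and open a new memmap. This is a different technique from resizing the base mmap, and `shrink_memmap` is a hypothetical helper written for this sketch, not a NumPy function.

```python
import os
import tempfile

import numpy as np


def shrink_memmap(path, dtype, new_len):
    """Truncate the file to hold new_len items and reopen it (hypothetical helper)."""
    dtype = np.dtype(dtype)
    os.truncate(path, new_len * dtype.itemsize)
    return np.memmap(path, dtype=dtype, mode="r+", shape=(new_len,))


# Hypothetical scratch file (assumption).
path = os.path.join(tempfile.mkdtemp(), "bla.bin")

a = np.memmap(path, mode="w+", dtype=np.int64, shape=(10,))
a[:] = np.arange(1, 11)
a.flush()
del a  # drop the old mapping before the file gets shorter

b = shrink_memmap(path, np.int64, 5)
final_size = os.path.getsize(path)
```

The old memmap is deleted first because reading a mapping past the new end of the file is not safe, and some platforms refuse to truncate a file that is still mapped.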
