NumPy object arrays

11 votes

1 answer

20705 views

Asked 2025-04-17 03:45

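I recently ran into some problems when creating NumPy object arrays, for example with code like this:

a = np.array([c], dtype=object)

Here c is an instance of some complicated class, and in some cases NumPy tries to access some of that class's methods. However, doing this instead:

a = np.empty((1,), dtype=object)
a[0] = c

solves the problem. I'm curious what the internal difference between the two is. Why does NumPy try to access some of c's attributes or methods in the first case?

Edit: to illustrate the problem, here is some example code: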

import numpy as np

class Thing(object):

    def __getitem__(self, item):
        print("in getitem")

    def __len__(self):
        return 1

a = np.array([Thing()], dtype='object')

This prints "in getitem" twice. Basically, if the class defines __len__, you can run into unexpected behavior.

Tags: numpy, class instances, attribute access, object arrays, method access, unexpected behavior, internal differences

1 Answer

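In the first example, a = np.array([c], dtype=object), NumPy does not know the shape of the array you want.

For example, when you define

d = range(10)
a = np.array([d])

you expect NumPy to work out the shape of the array from the length of d.

So in your case, NumPy checks whether len(c) is defined and, if it is, accesses the elements of c through c[i].

You can see this effect by defining a class such as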

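class X(object):
    def __len__(self): return 10
    def __getitem__(self, i):
        # Raise IndexError past the end so that iterating over X terminates.
        if not 0 <= i < 10: raise IndexError(i)
        return "x" * i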

Then

import numpy
print(numpy.array([X()], dtype=object))

produces

[[ x xx xxx xxxx xxxxx xxxxxx xxxxxxx xxxxxxxx xxxxxxxxx]]
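The same effect can be seen from the resulting shapes alone. The following is only a minimal sketch: `Plain` and `SeqLike` are hypothetical classes made up for illustration, and the only NumPy call used is `np.array`. An object without `__len__`/`__getitem__` is stored as a single element, while a sequence-like object is unpacked into an extra dimension.

```python
import numpy as np

class Plain:
    """Hypothetical class with no sequence methods."""
    pass

class SeqLike:
    """Hypothetical class that looks like a sequence of length n."""
    def __init__(self, n):
        self.n = n
    def __len__(self):
        return self.n
    def __getitem__(self, i):
        # Signal the end of the sequence so iteration terminates.
        if not 0 <= i < self.n:
            raise IndexError(i)
        return "x" * i

# Stored as one opaque element: nothing to unpack.
plain_arr = np.array([Plain()], dtype=object)
# Treated as a nested sequence: NumPy reads its items via __getitem__.
seq_arr = np.array([SeqLike(3)], dtype=object)

print(plain_arr.shape)  # (1,)
print(seq_arr.shape)    # (1, 3)
```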

In contrast, in your second example

a = np.empty((1,), dtype=object)
a[0] = c

the shape of a has already been fixed, so NumPy can simply assign the object.
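As a rough sketch of that pattern, assuming all you need is a 1-D object array holding arbitrary instances: `Recorder` and `pack_objects` below are hypothetical names, not part of NumPy. `Recorder` logs every index it is asked for, which makes it easy to check that the element-by-element assignment never touches `__getitem__`.

```python
import numpy as np

class Recorder:
    """Hypothetical sequence-like class that logs each requested index."""
    def __init__(self, n):
        self.n = n
        self.calls = []
    def __len__(self):
        return self.n
    def __getitem__(self, i):
        self.calls.append(i)
        if not 0 <= i < self.n:
            raise IndexError(i)
        return "x" * i

def pack_objects(items):
    """Build a 1-D object array whose elements are the items themselves."""
    items = list(items)
    out = np.empty(len(items), dtype=object)  # shape is fixed up front
    for k, item in enumerate(items):
        out[k] = item  # single-element assignment stores the object as is
    return out

r = Recorder(3)
arr = pack_objects([r])
print(arr.shape)    # (1,)
print(arr[0] is r)  # True
print(r.calls)      # []
```

The loop is slower than a single `np.array` call, but it sidesteps the shape guessing entirely.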

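However, this only holds when a is a vector. If it is defined with any other shape, the method access still happens. For example, the following code still calls __getitem__ on the class:

a = numpy.empty((1, 10), dtype=object)
a[0] = X()
print(a)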

[[ x xx xxx xxxx xxxxx xxxxxx xxxxxxx xxxxxxxx xxxxxxxxx]]
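To make the difference concrete, here is a small sketch under the assumption that what matters is whether the index selects a whole row or a single element. `SeqLike` is a hypothetical class used only for illustration. Assigning to a row unpacks the object across that row, while indexing every axis with an integer stores the object itself.

```python
import numpy as np

class SeqLike:
    """Hypothetical class that looks like a sequence of length n."""
    def __init__(self, n):
        self.n = n
    def __len__(self):
        return self.n
    def __getitem__(self, i):
        if not 0 <= i < self.n:
            raise IndexError(i)
        return "x" * i

# Row assignment: a[0] is a length-3 slot, so the object is unpacked into it.
grid = np.empty((1, 3), dtype=object)
grid[0] = SeqLike(3)
unpacked = list(grid[0])
print(unpacked)  # ['', 'x', 'xx']

# Full integer index: exactly one element is addressed, so the object is kept.
grid2 = np.empty((1, 3), dtype=object)
item = SeqLike(3)
grid2[0, 0] = item
print(grid2[0, 0] is item)  # True
```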

Answered 2025-04-17 by Python大师
