Named dtype array: difference between a[0]['name'] and a['name'][0]?

Posted 2024-04-19 23:54:55


I came across the following oddity in numpy, which may or may not be a bug:

import numpy as np
dt = np.dtype([('tuple', (int, 2))])
a = np.zeros(3, dt)
type(a['tuple'][0])  # ndarray
type(a[0]['tuple'])  # ndarray

a['tuple'][0] = (1,2)  # ok
a[0]['tuple'] = (1,2)  # ValueError: shape-mismatch on array construction

I would have expected both of the assignments above to work. Opinions?


3 answers

I asked about this on the numpy-discussion list. Travis Oliphant answered here.

Quoting his answer:

The short answer is that this is not really a "normal" bug, but it could be considered a "design" bug (although the issues may not be straightforward to resolve). What that means is that it may not be changed in the short term - and you should just use the first spelling.

Structured arrays can be a confusing area of NumPy for several reasons. You've constructed an example that touches on several of them. You have a data-type that is a "structure" array with one member ("tuple"). That member contains a 2-vector of integers.

First of all, it is important to remember that with Python, doing

a['tuple'][0] = (1,2)

is equivalent to

b = a['tuple']; b[0] = (1,2)

In like manner,

a[0]['tuple'] = (1,2)

is equivalent to

b = a[0]; b['tuple'] = (1,2)

To understand the behavior, we need to dissect both code paths and what happens. You built a (3,) array of those elements in 'a'. When you write b = a['tuple'] you should probably be getting a (3,) array of (2,)-integers, but as there is currently no formal dtype support for (n,)-integers as a general dtype in NumPy, you get back a (3,2) array of integers which is the closest thing that NumPy can give you. Setting the [0] row of this object via

a['tuple'][0] = (1,2)

works just fine and does what you would expect.
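As an illustration of this first code path (a minimal sketch added for clarity, not part of the quoted reply), the snippet below checks the shape and dtype of `a['tuple']` and confirms that writing to its first row is visible through `a`. It reuses the `dt` and `a` from the question.

```python
import numpy as np

dt = np.dtype([('tuple', (int, 2))])
a = np.zeros(3, dt)

b = a['tuple']        # field access on the whole array
print(b.shape)        # three records, each holding a 2-vector
print(b.dtype)        # plain integer dtype; the structure is gone

b[0] = (1, 2)         # ordinary row assignment on a 2-D integer array
print(a['tuple'][0])  # the write shows up when reading back through `a`
```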

On the other hand, when you type:

b = a[0]

you are getting back an array-scalar which is a particularly interesting kind of array scalar that can hold records. This new object is formally of type numpy.void and it holds a "scalar representation" of anything that fits under the "VOID" basic dtype.
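A short sketch of the second code path (again an added illustration, not part of the quoted reply): indexing a single element yields a `numpy.void` scalar that carries the record's field names, and reading its field gives back a small array.

```python
import numpy as np

dt = np.dtype([('tuple', (int, 2))])
a = np.zeros(3, dt)

b = a[0]                   # a single record, not an ndarray
print(type(b))             # the numpy.void scalar type
print(b.dtype.names)       # field names carried by the record
print(type(b['tuple']))    # reading the field returns an ndarray
print(b['tuple'].shape)    # one 2-vector
```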

For some reason:

b['tuple'] = [1,2]

is not working. On my system I'm getting a different error: TypeError: object of type 'int' has no len()

I think this should be filed as a bug on the issue tracker which is for the time being here: http://projects.scipy.org/numpy

The problem is ultimately the void->copyswap function being called in voidtype_setfields if someone wants to investigate. I think this behavior should work.

This is explained in a numpy bug report.

This was an upstream bug. It was fixed by NumPy PR #5947, and the fix shipped in 1.9.3.
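To see which behavior an installed NumPy has, a hedged check along the following lines can help. The name `second_ok` is introduced only for this illustration. The sketch assumes that a failing assignment raises `ValueError` or `TypeError`, the two error types reported in this thread.

```python
import numpy as np

dt = np.dtype([('tuple', (int, 2))])
a = np.zeros(3, dt)

# First spelling: reported to work in every version discussed here.
a['tuple'][0] = (1, 2)

# Second spelling: behavior depends on the installed NumPy version.
try:
    a[1]['tuple'] = (3, 4)
    second_ok = True
except (TypeError, ValueError) as exc:
    second_ok = False
    print("a[1]['tuple'] = ... failed:", exc)

print(np.__version__, "second spelling works:", second_ok)
print(a)
```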

I get a different error than you do (using numpy 1.7.0.dev):

ValueError: setting an array element with a sequence.

So the explanation below may not be correct for your system (or it may even be the wrong explanation for what I'm seeing).

First, note that indexing a row of a structured array gives you a numpy.void object (see the data type docs):

print(type(a[0]))  # = numpy.void

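As I understand it, void is somewhat like a Python list in that it can hold objects of different data types. That makes sense, because the columns of a structured array can have different data types.

If, instead of indexing, you slice out the first row, you get an ndarray: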

print(type(a[:1]))  # = numpy.ndarray

This is analogous to how Python lists work:

b = [1, 2, 3]
print(b[0])   # 1
print(b[:1])  # [1]

Slicing returns a shortened version of the original sequence, but indexing returns a single element (here an int; above, a void).
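The same contrast can be checked directly on the structured array. This is a small sketch that reuses the question's `dt` and `a`:

```python
import numpy as np

dt = np.dtype([('tuple', (int, 2))])
a = np.zeros(3, dt)

element = a[0]    # indexing: one record
window = a[:1]    # slicing: a shorter structured array

print(type(element))               # the numpy.void scalar type
print(type(window), window.shape)  # an ndarray containing one record
print(window.dtype == a.dtype)     # the slice keeps the structured dtype
```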

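So when you slice the rows of a structured array, you should expect the result to behave just like the original array (only with fewer rows). Continuing your example, you can now assign to the 'tuple' column of the first row: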

a[:1]['tuple'] = (1, 2)
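A minimal sketch to confirm that this assignment reaches the original array: `a[:1]` is a view, so the write through its 'tuple' field lands in row 0 of `a` and leaves the other rows alone.

```python
import numpy as np

dt = np.dtype([('tuple', (int, 2))])
a = np.zeros(3, dt)

# a[:1]['tuple'] has shape (1, 2); the tuple is broadcast into its single row.
a[:1]['tuple'] = (1, 2)

print(a['tuple'])  # only the first row has changed
```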

So... why doesn't a[0]['tuple'] = (1, 2) work?

Well, recall that a[0] returns a void object. So, when you call

a[0]['tuple'] = (1, 2) # this line fails

you're assigning a tuple to the 'tuple' element of that void object. Note: even though you named this field 'tuple', it is stored as an ndarray:

print(type(a[0]['tuple']))  # = numpy.ndarray

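This means the tuple needs to be converted to an ndarray. However, the void object can't cast on assignment (this is just a guess): it can contain arbitrary data types, so it doesn't know which type to cast to. To get around this, you can cast the input yourself: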

a[0]['tuple'] = np.array((1, 2))

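The fact that we get different errors suggests the line above might not work for you, since the cast addresses the error I received, not the one you received.

Addendum:

So why does the following work?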

a[0]['tuple'][:] = (1, 2)

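Here, adding [:] means you are indexing into the array, whereas without it you are indexing into the void object. In other words, a[0]['tuple'][:] says "replace the elements of the stored array" (which is handled by the array), while a[0]['tuple'] says "replace the stored array" (which is handled by void).

Epilogue: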
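A short sketch of the `[:]` spelling, assuming as described above that `a[0]['tuple']` hands back an array backed by the record's storage:

```python
import numpy as np

dt = np.dtype([('tuple', (int, 2))])
a = np.zeros(3, dt)

stored = a[0]['tuple']  # the 2-element array held by the first record
stored[:] = (1, 2)      # element-wise write, handled by ndarray

print(a['tuple'][0])    # read the first record back through `a`
```

Because the write goes through ndarray slice assignment, the void object's own item assignment is never involved.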

Strangely, accessing a row (i.e. indexing with 0) seems to drop the base array, yet it still lets you assign to the base array.

print(a['tuple'].base is a)  # = True
print(a[0].base is a)  # = False
a[0] = ((1, 2),) # `a` is changed

Maybe void isn't really an array, so it doesn't have a base array... but then why does it have a base attribute?
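For anyone who wants to repeat this check on their own installation, here is a self-contained sketch. The `base` results are printed without a stated expected value, since they may differ between NumPy versions. The name `row` is introduced only for this illustration.

```python
import numpy as np

dt = np.dtype([('tuple', (int, 2))])
a = np.zeros(3, dt)

row = a[0]
print(a['tuple'].base is a)  # whether the field view reports `a` as its base
print(row.base is a)         # whether the void scalar reports `a` as its base

# Assigning a whole record: one tuple entry per field.
a[0] = ((1, 2),)
print(a['tuple'])            # the first row of `a` has changed
```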