What does .dtype do?

3 answers

User

Answer 1 · edited 2024-04-20 02:06:39

Changing the dtype this way changes how a fixed block of memory is interpreted.

Example:

>>> import numpy as np
>>> a=np.array([1,0,0,0,0,0,0,0],dtype='int8')
>>> a
array([1, 0, 0, 0, 0, 0, 0, 0], dtype=int8)
>>> a.dtype='int64'
>>> a
array([1])

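Notice how the change from int8 to int64 turned the 8 elements of an 8-bit integer array into 1 element of a 64-bit array. It is the same 8-byte block, however. On my i7 machine, with its native (little-endian) byte order, that byte pattern is the same as 1 in int64 format.

Change the position of the 1: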

>>> a=np.array([0,0,0,1,0,0,0,0],dtype='int8')
>>> a.dtype='int64'
>>> a
array([16777216])

Another example:

>>> a=np.array([0,0,0,0,0,0,1,0],dtype='int32')
>>> a.dtype='int64'
>>> a
array([0, 0, 0, 1])

Change the position of the 1 in the 32-byte, 32-bit array:

>>> a=np.array([0,0,0,1,0,0,0,0],dtype='int32')
>>> a.dtype='int64'
>>> a
array([         0, 4294967296,          0,          0])

It is the same block of bits, reinterpreted.
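To make the dependence on byte order explicit, here is a small sketch that views the same 8 bytes with a spelled-out byte order. The variable names are only for illustration; '<i8' and '>i8' are NumPy's type strings for little-endian and big-endian 64-bit integers.

```python
import numpy as np

# Eight bytes, with a 1 in the first byte and zeros elsewhere.
raw = np.array([1, 0, 0, 0, 0, 0, 0, 0], dtype=np.int8)

# Little-endian: the first byte is the least significant one.
as_little = raw.view('<i8')
# Big-endian: the first byte is the most significant one.
as_big = raw.view('>i8')

print(as_little)  # the value 1, as on the i7 above
print(as_big)     # the value 1 << 56
```

Neither view copies anything; both read the buffer owned by raw.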

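User

Answer 2 · edited 2024-04-20 02:06:39

After messing around with it, I think manually assigning dtype does a reinterpreting cast rather than what you want. That is, it interprets the data directly as a float instead of converting it to one. Maybe you could try aa = numpy.array(list(map(float, aa))).

Further explanation: dtype is the type of the data. To quote the documentation verbatim: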

A data type object (an instance of numpy.dtype class) describes how the bytes in the fixed-size block of memory corresponding to an array item should be interpreted.

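An int and a float do not have the same bit pattern, so you cannot just look at the memory holding an int and expect it to be the same number when you treat it as a float. By setting the dtype to float64, you are only telling the computer to read that memory as float64, not actually converting the integers to floating-point numbers.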
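A minimal sketch of that point, assuming 64-bit types on both sides (the variable names are only for illustration): the float 1.0 and the integer 1 each occupy 8 bytes, but their bit patterns are unrelated.

```python
import numpy as np

one_float = np.array([1.0], dtype=np.float64)
one_int = np.array([1], dtype=np.int64)

# Same bytes, read as the other type: no conversion takes place.
float_bits_as_int = one_float.view(np.int64)
int_bits_as_float = one_int.view(np.float64)

print(hex(int(float_bits_as_int[0])))  # IEEE 754 bit pattern of 1.0
print(int_bits_as_float[0])            # a tiny subnormal number, not 1.0

# An actual value conversion goes through astype.
print(one_int.astype(np.float64))
```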

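User

Answer 3 · edited 2024-04-20 02:06:39

First off, the code you are learning from is flawed. Judging by the comments in the code, it almost certainly does not do what its original author thought it did.

What the author probably meant was: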

def to_1d(array):
    """prepares an array into a 1d real vector"""
    return array.astype(np.float64).ravel()

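However, if array is always a complex array, then the original code makes some sense.

The only cases where viewing the array (a.dtype = 'float64' is equivalent to doing a = a.view('float64')) would double its size are a complex array (numpy.complex128) or a 128-bit floating-point array. For any other dtype it doesn't make much sense.

For the specific case of a complex array, the original code would convert something like np.array([0.5+1j, 9.0+1.33j]) into np.array([0.5, 1.0, 9.0, 1.33]).

A cleaner way to write that would be:

def complex_to_interleaved_real(array):
    """prepares a complex array into an "interleaved" 1d real vector"""
    return array.copy().view('float64').ravel()

(I'm ignoring the part about returning the original dtype and shape for the moment.)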
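As a rough sketch of the part skipped here, the interleaved vector can be turned back into a complex array by viewing it as complex128 and restoring the shape. restore_complex and the sample values are hypothetical, chosen for illustration, and are not part of the original code.

```python
import numpy as np

def complex_to_interleaved_real(array):
    """Copy a complex128 array into an interleaved 1d float64 vector."""
    return array.copy().view('float64').ravel()

def restore_complex(flat, shape):
    """Hypothetical inverse: pair the floats up again and restore the shape."""
    return flat.view(np.complex128).reshape(shape)

z = np.array([[0.5 + 1j, 9.0 + 1.33j],
              [1.0 + 2j, 3.0 + 0j]])
flat = complex_to_interleaved_real(z)
restored = restore_complex(flat, z.shape)

print(flat)
print(np.array_equal(restored, z))
```

Because of the copy() call, changes to flat do not write through to z.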

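Background on numpy arrays

To explain what is going on here, you need to understand a bit about what numpy arrays are.

A numpy array consists of a "raw" memory buffer that is interpreted as an array through "views". You can think of all numpy arrays as views.

Views, in the numpy sense, are just different ways of slicing and dicing the same memory buffer without making a copy.

A view has a shape, a data type (dtype), an offset, and strides. Where possible, indexing/reshaping operations on a numpy array will just return a view of the original memory buffer.

This means that things like y = x.T or y = x[::2] don't use any extra memory and don't make copies of x.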
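A short sketch of what "no copy" means in practice, assuming an int64 array so the strides are predictable:

```python
import numpy as np

x = np.arange(10, dtype=np.int64)
y = x[::2]   # every second element, as a view

print(np.shares_memory(x, y))   # both use the same buffer
print(x.strides, y.strides)     # the view simply steps twice as far

# Writing through the view changes the original array.
y[0] = 99
print(x[0])
```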

So, if we have an array like this:

import numpy as np
x = np.array([1,2,3,4,5,6,7,8,9,10])

We could reshape it in either of two ways:

x = x.reshape((2, 5))

or

x.shape = (2, 5)

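For readability, the first option is better. They are (almost) exactly equivalent, though. Neither one makes a copy that uses up more memory (the first does create a new python object, but that is beside the point for now).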
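The "new python object, same memory" remark can be checked directly. This is only a sketch; r is an illustrative name so that the original x stays available for comparison.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
r = x.reshape((2, 5))

print(r is x)                   # a different python object...
print(np.shares_memory(x, r))   # ...over the same memory buffer
print(x.shape, r.shape)         # x itself keeps its 1d shape
```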

Dtypes and views

The same thing applies to the dtype. We can view an array as a different dtype either by setting x.dtype or by calling x.view(...).

So we can do things like this:

import numpy as np
# np.int was removed in NumPy 1.24; int64 matches the 8-byte integers in the output below
x = np.array([1, 2, 3], dtype=np.int64)

print('The original array')
print(x)

print('\n...Viewed as unsigned 8-bit integers (notice the length change!)')
y = x.view(np.uint8)
print(y)

print('\n...Doing the same thing by setting the dtype')
x.dtype = np.uint8
print(x)

print('\n...And we can set the dtype again and go back to the original.')
x.dtype = np.int64
print(x)

This yields:

The original array
[1 2 3]

...Viewed as unsigned 8-bit integers (notice the length change!)
[1 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0]

...Doing the same thing by setting the dtype
[1 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0]

...And we can set the dtype again and go back to the original.
[1 2 3]
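The length change follows from the item sizes: the total number of bytes stays fixed, so the element count scales by the ratio of the item sizes. A small sketch, with int64 assumed as the starting dtype; it also shows that NumPy refuses a view when the bytes do not divide evenly.

```python
import numpy as np

x = np.array([1, 2, 3], dtype=np.int64)
y = x.view(np.uint8)

print(x.size, x.itemsize, x.nbytes)   # 3 elements of 8 bytes each
print(y.size, y.itemsize, y.nbytes)   # 24 elements of 1 byte each

# Three bytes cannot be split into whole 2-byte integers.
odd = np.zeros(3, dtype=np.int8)
try:
    odd.view(np.int16)
    view_failed = False
except ValueError:
    view_failed = True
print(view_failed)
```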

Keep in mind, though, that this gives you low-level control over the way the memory buffer is interpreted.

For example:

import numpy as np
x = np.arange(10, dtype=np.int64)

print('An integer array:', x)
print('But if we view it as a float:', x.view(np.float64))
print("...It's probably not what we expected...")

This yields:

An integer array: [0 1 2 3 4 5 6 7 8 9]
But if we view it as a float: [  0.00000000e+000   4.94065646e-324   
   9.88131292e-324   1.48219694e-323   1.97626258e-323   
   2.47032823e-323   2.96439388e-323   3.45845952e-323
   3.95252517e-323   4.44659081e-323]
...It's probably not what we expected...

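So, in this case, we are interpreting the underlying bits of the original memory buffer as floats.

If we wanted to make a new copy with the ints recast as floats, we would use x.astype(np.float64).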
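Side by side, the two operations look like this. It is a sketch with illustrative variable names: view keeps the buffer and changes the interpretation, while astype allocates a new buffer and converts each value.

```python
import numpy as np

x = np.arange(10, dtype=np.int64)

viewed = x.view(np.float64)       # same bytes, read as floats
converted = x.astype(np.float64)  # new buffer, values converted

print(np.shares_memory(x, viewed))      # view: no copy
print(np.shares_memory(x, converted))   # astype: independent copy
print(converted)
```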

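Complex numbers

Complex numbers are stored (in C, python, and numpy alike) as two floats. The first is the real part and the second is the imaginary part.

So, if we do: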

import numpy as np
x = np.array([0.5+1j, 1.0+2j, 3.0+0j])

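we can see the real (x.real) and imaginary (x.imag) parts. If we convert this to a float, we get a warning about discarding the imaginary part, and we get an array with only the real part.

print(x.real)
print(x.astype(float))

astype makes a copy and converts the values to the new type.

However, if we view this array as a float, we get a sequence of item1.real, item1.imag, item2.real, item2.imag, ...

print(x)
print(x.view(float))

This yields: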

[ 0.5+1.j  1.0+2.j  3.0+0.j]
[ 0.5  1.   1.   2.   3.   0. ]

Each complex number is essentially two floats, so if we change how numpy interprets the underlying memory buffer, we get an array of twice the length.
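The interleaved layout can be confirmed with a couple of strided slices. This is a sketch using the same sample array; floats is an illustrative name.

```python
import numpy as np

x = np.array([0.5 + 1j, 1.0 + 2j, 3.0 + 0j])
floats = x.view(np.float64)

print(floats.size == 2 * x.size)
# Even positions hold the real parts, odd positions the imaginary parts.
print(np.array_equal(floats[0::2], x.real))
print(np.array_equal(floats[1::2], x.imag))
```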

Hopefully that helps clear things up a bit...
