Converting a Python dictionary to a NumPy structured array

Published 2024-05-12 20:40:05


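I have a dictionary that I need to convert to a NumPy structured array. I am passing it to an arcpy function, so a NumPy structured array is the only data format that will work.

Based on these two threads: "Writing to numpy array from dictionary" and "How to convert Python dictionary object to numpy array"

I tried this: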

result = {0: 1.1181753789488595, 1: 0.5566080288678394, 2: 0.4718269778030734, 3: 0.48716683119447185, 4: 1.0, 5: 0.1395076201641266, 6: 0.20941558441558442}

names = ['id','data']
formats = ['f8','f8']
dtype = dict(names = names, formats=formats)
array=numpy.array([[key,val] for (key,val) in result.iteritems()],dtype)

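But I keep getting the error "expected a readable buffer object".

The approach below works, but it is clumsy and obviously does not scale to real data. I know there must be a more elegant way; I just cannot figure it out.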

totable = numpy.array([[key,val] for (key,val) in result.iteritems()])
array=numpy.array([(totable[0,0],totable[0,1]),(totable[1,0],totable[1,1])],dtype)

3 answers

You can use np.array(list(result.items()), dtype=dtype):

import numpy as np
result = {0: 1.1181753789488595, 1: 0.5566080288678394, 2: 0.4718269778030734, 3: 0.48716683119447185, 4: 1.0, 5: 0.1395076201641266, 6: 0.20941558441558442}

names = ['id','data']
formats = ['f8','f8']
dtype = dict(names = names, formats=formats)
array = np.array(list(result.items()), dtype=dtype)

print(repr(array))

yields

array([(0.0, 1.1181753789488595), (1.0, 0.5566080288678394),
       (2.0, 0.4718269778030734), (3.0, 0.48716683119447185), (4.0, 1.0),
       (5.0, 0.1395076201641266), (6.0, 0.20941558441558442)], 
      dtype=[('id', '<f8'), ('data', '<f8')])
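Once the array exists, each field can be read by name. The sketch below uses shortened sample values, and it assumes the dictionary keys should stay integers, so it picks 'i8' for the id field instead of the 'f8' used above; that choice is an illustration, not part of the original answer.

```python
import numpy as np

# Shortened sample data (hypothetical values, same shape as the question's dict)
result = {0: 1.118, 1: 0.557, 2: 0.472}

# Assumption: keep the keys as 64-bit integers rather than floats
dtype = np.dtype([('id', 'i8'), ('data', 'f8')])
array = np.array(list(result.items()), dtype=dtype)

# Each field of a structured array is available as an ordinary 1-D array
ids = array['id']
data = array['data']
print(ids)
print(data)
```

Whether the consuming function prefers integer or float ids depends on that function, so check its requirements before changing the format.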

If you do not want to build the intermediate list of tuples, list(result.items()), you can use np.fromiter instead.

In Python 2:

array = np.fromiter(result.iteritems(), dtype=dtype, count=len(result))

In Python 3:

array = np.fromiter(result.items(), dtype=dtype, count=len(result))
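As a quick check that the two routes agree, here is a minimal Python 3 sketch with shortened sample values (the variable names via_list and via_iter are only for illustration). It builds the same structured array once through an intermediate list and once through np.fromiter, then compares the records.

```python
import numpy as np

# Shortened sample data (hypothetical values)
result = {0: 1.5, 1: 2.5, 2: 3.5}
dtype = np.dtype({'names': ['id', 'data'], 'formats': ['f8', 'f8']})

# Route 1: materialize the (key, value) tuples in a list first
via_list = np.array(list(result.items()), dtype=dtype)

# Route 2: consume the items iterator directly; count lets NumPy
# allocate the output once instead of growing it
via_iter = np.fromiter(result.items(), dtype=dtype, count=len(result))

# Compare the records as plain Python tuples
print(via_list.tolist() == via_iter.tolist())
```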

Why using lists does not work:

By the way, your attempt,

numpy.array([[key,val] for (key,val) in result.iteritems()],dtype)

was very close to working. If you change the list [key, val] to the tuple (key, val), it works. Of course,

numpy.array([(key,val) for (key,val) in result.iteritems()], dtype)

is the same thing as

numpy.array(result.items(), dtype)

in Python 2, or

numpy.array(list(result.items()), dtype)

in Python 3.

np.array treats lists differently from tuples. As Robert Kern explains:

As a rule, tuples are considered "scalar" records and lists are recursed upon. This rule helps numpy.array() figure out which sequences are records and which are other sequences to be recursed upon; i.e. which sequences create another dimension and which are the atomic elements.

Because (0.0, 1.1181753789488595) is meant to be one of those atomic elements, it has to be a tuple, not a list.
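The rule can be tried out with a small sketch like the one below (sample values and variable names are illustrative). Converting each inner list to a tuple gives one record per row. What the list form does may differ between NumPy versions, which is an assumption not verified here, so the code only reports the outcome instead of relying on it.

```python
import numpy as np

dtype = np.dtype([('id', 'f8'), ('data', 'f8')])
rows_as_lists = [[0, 1.5], [1, 2.5]]

# Tuples are read as records: one record per tuple, so the result is 1-D
records = np.array([tuple(row) for row in rows_as_lists], dtype=dtype)
print(records.shape)

# Lists are recursed into instead of being treated as records.
# The outcome is only printed, because it may be an error or an
# array with an extra dimension depending on the NumPy version.
try:
    from_lists = np.array(rows_as_lists, dtype=dtype)
    print("lists gave shape", from_lists.shape)
except (TypeError, ValueError) as exc:
    print("lists raised:", exc)
```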

For the case where the dictionary values are lists of equal length, let me suggest an extended approach:

import numpy

def dctToNdarray(dd, szFormat='f8'):
    '''
    Convert a 'rectangular' dictionary to a NumPy structured array.
    Input
        dd : dictionary whose values are lists of the same length
    Returns
        data : NumPy structured array with one field per key
    '''
    # list() is needed on Python 3, where keys() returns a view that cannot be indexed
    names = list(dd.keys())
    firstKey = names[0]
    formats = [szFormat] * len(names)
    dtype = dict(names=names, formats=formats)
    # Build the first record, then append the remaining rows one at a time
    values = [tuple(dd[k][0] for k in names)]
    data = numpy.array(values, dtype=dtype)
    for i in range(1, len(dd[firstKey])):
        values = [tuple(dd[k][i] for k in names)]
        data_tmp = numpy.array(values, dtype=dtype)
        data = numpy.concatenate((data, data_tmp))
    return data

dd = {'a': [1, 2.05, 25.48], 'b': [2, 1.07, 9], 'c': [3, 3.01, 6.14]}
data = dctToNdarray(dd)
print(data.dtype.names)
print(data)
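For a dictionary of equal-length lists, the rows can also be assembled in a single pass with zip, which avoids concatenating one record at a time. This is a sketch of an alternative, not the answer author's method; the function name dict_of_lists_to_structured is hypothetical.

```python
import numpy as np

def dict_of_lists_to_structured(dd, fmt='f8'):
    """Turn {name: [v0, v1, ...]} into a structured array with one field per key."""
    names = list(dd.keys())
    dtype = np.dtype([(name, fmt) for name in names])
    # zip(*columns) transposes the columns into row tuples, which
    # np.array reads as records
    rows = list(zip(*(dd[name] for name in names)))
    return np.array(rows, dtype=dtype)

dd = {'a': [1, 2.05, 25.48], 'b': [2, 1.07, 9], 'c': [3, 3.01, 6.14]}
data = dict_of_lists_to_structured(dd)
print(data.dtype.names)
print(data)
```

Note that zip stops at the shortest list, so lists of unequal length would be truncated silently rather than rejected.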

Simpler still, if you are willing to use pandas:

import pandas
result = {0: 1.1181753789488595, 1: 0.5566080288678394, 2: 0.4718269778030734, 3: 0.48716683119447185, 4: 1.0, 5: 0.1395076201641266, 6: 0.20941558441558442}
df = pandas.DataFrame(result, index=[0])
print(df)

which gives:

          0         1         2         3  4         5         6
0  1.118175  0.556608  0.471827  0.487167  1  0.139508  0.209416
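The frame above holds a single row with one column per key. If the goal is still a two-field record layout like the one in the question, one possible route is to build a two-column frame and call DataFrame.to_records. This is a sketch under assumptions: pandas is installed, the values are shortened samples, and the column names 'id' and 'data' are borrowed from the question.

```python
import pandas as pd

# Shortened sample data (hypothetical values)
result = {0: 1.5, 1: 2.5, 2: 3.5}

# One row per (key, value) pair, with named columns
df = pd.DataFrame(list(result.items()), columns=['id', 'data'])

# to_records returns a NumPy record array; index=False leaves out
# the frame's row index so only the two columns become fields
records = df.to_records(index=False)
print(records.dtype.names)
print(records)
```

Whether a record array is accepted wherever a plain structured array is expected depends on the consumer, so that part should be checked separately.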
