How to select elements row-wise from a NumPy array?
I have a NumPy array like this:
dd= [[foo 0.567 0.611]
[bar 0.469 0.479]
[noo 0.220 0.269]
[tar 0.480 0.508]
[boo 0.324 0.324]]
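How would I iterate through the array, selecting 'foo' and getting 0.567 and 0.611 as individual float values, then selecting 'bar' and getting 0.469 and 0.479 as individual float values, and so on?
I can get the first elements as a list by using: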
dv= dd[:,1]
The elements such as 'foo' and 'bar' are not unknown variables; they can change.
How would I change this if that element is in position [1]?
[[0.567 foo2 0.611]
[0.469 bar2 0.479]
[0.220 noo2 0.269]
[0.480 tar2 0.508]
[0.324 boo2 0.324]]
2 Answers
3
First, the vector of first elements is
dv = dd[:,0]
(in Python, indexing starts at 0)
Second, to iterate over the array (say, to store it in a dictionary), you can write:
dc = {}
ind = 0 # this corresponds to the column with the names
for row in dd:
    dc[row[ind]] = row[1:]
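For reference, here is a self-contained sketch of the same loop. It assumes `dd` was built from a nested list with `dtype=object` (the question does not show how `dd` was created), and it uses `np.delete` to drop the name column so that the same code also works when the names sit in column 1.

```python
import numpy as np

# Assumed construction of dd; dtype=object keeps strings and floats as they are.
dd = np.array([['foo', 0.567, 0.611],
               ['bar', 0.469, 0.479],
               ['noo', 0.220, 0.269],
               ['tar', 0.480, 0.508],
               ['boo', 0.324, 0.324]], dtype=object)

ind = 0  # column holding the names; set to 1 for the second layout in the question
dc = {}
for row in dd:
    # Remove the name column and convert the remaining entries to floats.
    dc[row[ind]] = np.delete(row, ind).astype(float)

print(dc['foo'])
print(dc['bar'])
```

Each dictionary value is a small float array, so the two numbers can be unpacked with something like `a, b = dc['foo']`.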
31
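You tagged your question NumPy, so I assume you want NumPy syntax, which the previous answer doesn't use.
If you do in fact want to use NumPy, then you probably don't want strings in your array; otherwise you would also have to represent your floats as strings.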
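A quick way to see this for yourself is the sketch below. It assumes the array is created with a plain `np.array` call on a nested list and no explicit `dtype`; the variable names are only illustrative.

```python
import numpy as np

# Mixing strings and floats without an explicit dtype yields a string array.
mixed = np.array([['foo', 0.567, 0.611],
                  ['bar', 0.469, 0.479]])

print(mixed.dtype.kind)    # 'U' indicates a Unicode string dtype
print(type(mixed[0, 1]))   # a NumPy string type, not a float

# Getting numbers back requires an explicit conversion.
floats = mixed[:, 1:].astype(float)
print(floats.dtype)
```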
What you are looking for is the NumPy syntax to access the elements of a 2D array by row (and exclude the first column).
That syntax is:
M[row_index,1:] # selects all but 1st col from row given by 'row_index'
As for the second scenario in your question, selecting non-adjacent columns:
M[row_index,[0,2]] # selects 1st & 3rd cols from row given by 'row_index'
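Both forms can be tried on a small array. In this sketch `M` holds the first three rows of the question's data, with the names replaced by a numeric index in column 0 (that replacement is an assumption made here for illustration).

```python
import numpy as np

# First three rows of the question's data; column 0 is a numeric row label.
M = np.array([[0., 0.567, 0.611],
              [1., 0.469, 0.479],
              [2., 0.220, 0.269]])

row_index = 0
tail = M[row_index, 1:]        # slice: every column except the first
picked = M[row_index, [0, 2]]  # list of column indices: 1st and 3rd columns

print(tail)
print(picked)
```

The slice returns a view of `M`, while indexing with a list of columns returns a copy.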
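One small complication in your question is that you want to use strings as row indices, so you need to remove those strings (so that you can create a 2D NumPy array of floats), replace them with numeric row indices, and then create a look-up table that maps each string to its numeric row index: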
>>> import numpy as NP
>>> # create a look-up table so you can remove the strings from your python nested list,
>>> # which will allow you to represent your data as a 2D NumPy array with dtype=float
>>> keys
['foo', 'bar', 'noo', 'tar', 'boo']
>>> values    # 1D index array comprised of one integer row index for each unique string in 'keys'
array([0, 1, 2, 3, 4])
>>> LuT = dict(zip(keys, values))
>>> # add an index to data by inserting 'values' array as first column of the data matrix
>>> # 'values' is 1D, so reshape it to a column before stacking it next to A
>>> A = NP.hstack((values[:, None], A))
>>> A
array([[ 0.   ,  0.567,  0.611],
       [ 1.   ,  0.469,  0.479],
       [ 2.   ,  0.22 ,  0.269],
       [ 3.   ,  0.48 ,  0.508],
       [ 4.   ,  0.324,  0.324]])
>>> # so now to look up an item, by 'key':
>>> # write a small function to perform the look-ups:
>>> def select_row(key):
...     # the row index must be an integer, so cast the look-up result
...     return A[int(LuT[key]), 1:]
>>> select_row('foo')
array([ 0.567, 0.611])
>>> select_row('noo')
array([ 0.22 , 0.269])
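The session above does not show how `keys`, `values`, and the float-only `A` were built. A minimal end-to-end sketch follows; the variable `raw` and the list comprehensions are assumptions made here, not part of the original session.

```python
import numpy as np

# Assumed starting point: the data as a nested Python list.
raw = [['foo', 0.567, 0.611],
       ['bar', 0.469, 0.479],
       ['noo', 0.220, 0.269],
       ['tar', 0.480, 0.508],
       ['boo', 0.324, 0.324]]

keys = [row[0] for row in raw]                        # the string labels
data = np.array([row[1:] for row in raw], dtype=float)
values = np.arange(len(keys))                         # integer row indices
LuT = dict(zip(keys, values))                         # label -> row index

# Insert the index as the first column (it becomes float after stacking).
A = np.hstack((values[:, None], data))

def select_row(key):
    # Look up the row by label and drop the index column.
    return A[LuT[key], 1:]

print(select_row('foo'))
print(select_row('noo'))
```

Since `LuT` already maps each label to its row, storing the index column inside `A` is optional; it is kept here only to mirror the session above.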
The second scenario in your question: what if the index column changes?
>>> # e.g., move index to column 1 (as in your Q)
>>> A = NP.roll(A, 1, axis=1)
>>> A
array([[ 0.611,  0.   ,  0.567],
       [ 0.479,  1.   ,  0.469],
       [ 0.269,  2.   ,  0.22 ],
       [ 0.508,  3.   ,  0.48 ],
       [ 0.324,  4.   ,  0.324]])
>>> # the original function is changed slightly, to select non-adjacent columns:
>>> def select_row2(key):
...     # the row index must be an integer, so cast the look-up result
...     return A[int(LuT[key]), [0, 2]]
>>> select_row2('foo')
array([ 0.611, 0.567])
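If the position of the index column is not known in advance, one option is to pass it in and mask it out. The sketch below assumes the rolled layout shown above; `select_row_any` and its `index_col` parameter are hypothetical names introduced only for illustration.

```python
import numpy as np

# Rolled layout from above: the numeric index sits in column 1.
A = np.array([[0.611, 0., 0.567],
              [0.479, 1., 0.469],
              [0.269, 2., 0.220],
              [0.508, 3., 0.480],
              [0.324, 4., 0.324]])
LuT = {'foo': 0, 'bar': 1, 'noo': 2, 'tar': 3, 'boo': 4}

def select_row_any(key, index_col):
    # Boolean mask that is True for every column except the index column.
    keep = np.arange(A.shape[1]) != index_col
    return A[LuT[key], keep]

print(select_row_any('foo', 1))
print(select_row_any('tar', 1))
```

Note that after `NP.roll` the two floats appear in swapped order relative to the original rows (0.611 before 0.567), so the caller has to account for that.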