Sorting a NumPy matrix with mixed types


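Asked on 2025-04-17 15:45

I want to sort this kind of (fairly wide) matrix, in which every column has a single type but the type may differ from column to column. The sort should keep all the columns of each row together while ordering the rows by the values in one particular column.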

[ 
[1, 0, 0.25,'ind1', 'pop2', 0.56],
[2, 0, 0.35,'ind2', 'pop2', 0.58],
[1, 0, 0.23,'ind1', 'pop1', 0.66],
...
]

Here I am sorting by the third column (index 2, the float column):

[ 
[1, 0, 0.23,'ind1', 'pop1', 0.66],
[1, 0, 0.25,'ind1', 'pop2', 0.56],
[2, 0, 0.35,'ind2', 'pop2', 0.58],
...
]

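Does the sort behave differently if some columns hold strings? Thanks for your help and suggestions. I have tried lexsort, sort, argsort... but I may be using them incorrectly.

Update: I don't know why, but if my matrix is defined with numpy.matrix(), the argsort() approach adds a dimension (the result becomes three-dimensional), whereas this does not happen if it is defined with numpy.array(). In case this helps other readers.

Tags: data structures, numpy, sorting, lexsort, argsort, dimension handling, matrix sorting, mixed types

1 Answer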

If your data type has named fields, you can use numpy.sort and pass the name of the field you want to sort on in the "order" parameter:

import numpy

fieldTypes = ['i4', 'i4', 'f8', 'U4', 'U4', 'f8'] # data types of each field ('U4' is a 4-character Unicode string)
fieldNames = ['a', 'b', 'c', 'd', 'e', 'f'] # names of the fields, feel free to give more descriptive names

# Create a numpy data type based on the types and fields
# (list() is required on Python 3, where zip returns an iterator)
myType = numpy.dtype(list(zip(fieldNames, fieldTypes)))

# Create the array with the right dtype; each row must be a tuple
a = numpy.array([(1, 0, 0.25, 'ind1', 'pop2', 0.56),
                 (2, 0, 0.35, 'ind2', 'pop2', 0.58),
                 (1, 0, 0.23, 'ind1', 'pop1', 0.66)], dtype=myType)

print(numpy.sort(a, order=['c'])) # sort based on field 'c'

Note that if you are creating an empty buffer, or loading NumPy data from an existing file or buffer, you can still convert it to a data type with named fields.
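As a rough illustration of that point, here is a minimal sketch that fills an empty structured buffer and then reinterprets raw bytes with the same dtype. The field names and the three-field layout are assumptions chosen for brevity, not part of the original data.

```python
import numpy

# Hypothetical three-field layout: an int, a float and a short string
my_type = numpy.dtype([('a', 'i4'), ('c', 'f8'), ('d', 'U4')])

# Start from an empty (zero-filled) buffer and fill it field by field
buf = numpy.zeros(3, dtype=my_type)
buf['a'] = [1, 2, 1]
buf['c'] = [0.25, 0.35, 0.23]
buf['d'] = ['ind1', 'ind2', 'ind1']

# Whole records move together when sorting on field 'c'
sorted_buf = numpy.sort(buf, order='c')

# Raw bytes (for example read from a file) can be reinterpreted
# with the same structured dtype
raw = buf.tobytes()
restored = numpy.frombuffer(raw, dtype=my_type)
sorted_restored = numpy.sort(restored, order='c')
```

Both sorted_buf and sorted_restored are new arrays; numpy.sort does not modify its input.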

If you don't have named fields, this answer may help; I recommend the approach suggested by @Steve Tjoa:

a[a[:,1].argsort()] # Replace 1 with the index you need
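A minimal sketch of the argsort approach on the data from the question, assuming the rows are stored in a plain 2-D array with dtype=object so that each cell keeps its own Python type. Casting the key column before calling argsort is my own choice here, to make the comparison type explicit. The last part assumes that a column sliced from a numpy.matrix stays two-dimensional, which is one plausible reason for the extra dimension mentioned in the question; converting with numpy.asarray first avoids it.

```python
import numpy

rows = [[1, 0, 0.25, 'ind1', 'pop2', 0.56],
        [2, 0, 0.35, 'ind2', 'pop2', 0.58],
        [1, 0, 0.23, 'ind1', 'pop1', 0.66]]

# dtype=object keeps numbers as numbers and strings as strings
a = numpy.array(rows, dtype=object)

# Sort rows by the float column (index 2); indexing with the
# argsort result reorders whole rows at once
by_float = a[a[:, 2].astype(float).argsort()]

# A string column sorts lexicographically; kind='stable' keeps
# rows with equal keys in their original relative order
by_pop = a[a[:, 4].astype(str).argsort(kind='stable')]

# For numpy.matrix input, convert to a plain ndarray first so the
# key column is one-dimensional
m = numpy.matrix([[3, 10], [1, 30], [2, 20]])
arr = numpy.asarray(m)
sorted_arr = arr[arr[:, 0].argsort()]
```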

Answered on 2025-04-17 by Python大师
