Sorting a list dynamically on a varying number of attributes

1 vote
2 answers
2240 views
Asked 2025-04-18 16:46

I have seen some solutions for sorting a list on a fixed number of attributes: Sort a list by multiple attributes?

One of them has a nice way to sort:

s = sorted(s, key = lambda x: (x[1], x[2]))

There is also an example that uses itemgetter.
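
For reference, the fixed-attribute itemgetter form might look like the sketch below. The sample tuples are made up for illustration and are not taken from the linked question.

```python
from operator import itemgetter

# Hypothetical sample data: sort on positions 1 and 2 of each tuple.
s = [('b', 2, 1), ('a', 1, 9), ('c', 1, 3)]

by_lambda = sorted(s, key=lambda x: (x[1], x[2]))
by_getter = sorted(s, key=itemgetter(1, 2))  # same key, built by itemgetter

print(by_getter)  # [('c', 1, 3), ('a', 1, 9), ('b', 2, 1)]
```

Both calls produce the same order; itemgetter(1, 2) returns the tuple (x[1], x[2]) for each item.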

However, the number of attributes varies in my case. For example, with two attributes:

example_list = [{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
    {'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38},
    # etc.
]

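But there could also be 1, 3, or more. I cannot use a lambda function or itemgetter written out like the ones above. I do, however, know the number of dimensions at run time (although it differs from run to run). So I did this (the example is set to 2 dimensions):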

example_list = [{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
    {'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38},
    {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'z', 'd2_sort': 38},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'z', 'd2_sort': 38}
]

def order_get(a, nr):
    # Collect d1_sort .. d<nr>_sort into a list that serves as the sort key.
    result = []
    for i in range(1, nr+1):
        result.append(a.get('d' + str(i) + '_sort'))
    return result

example_list.sort(key = lambda x: order_get(x, 2)) # for this example hard set to 2

In [82]: example_list
Out[82]: 
[{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
 {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
 {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'z', 'd2_sort': 38},
 {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
 {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
 {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'z', 'd2_sort': 38},
 {'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'x', 'd2_sort': 30},
 {'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38}]

But is this really the best way to do it? I mean 1) is it Pythonic, and 2) how does it perform? Is this a common problem?

2 Answers

1

You can support any number of sort keys, as long as their names follow a predictable pattern.

Suppose you have d[X]_sort, d[Y]_sort, and so on, where X and Y are integers and every sort key ends in _sort. The key function can then be written like this:

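import re

def arb_kf(d):
    # Keep only the dict keys that end in '_sort'.
    li = filter(lambda s: s.endswith('_sort'), d)
    # For each such key, build a tuple: (number in the key name, value).
    rtr = [tuple(map(int, re.findall(r'([0-9]+)', k) + [d[k]])) for k in li]
    # Order the tuples by the number in the key name.
    rtr.sort()
    return rtr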

Using your list of dicts as an example:

example_list = [{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
    {'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38},
    {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'z', 'd2_sort': 38},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'z', 'd2_sort': 38}
]


>>> for d in sorted(example_list, key=arb_kf) :
...     print d  
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 1, 'd1_desc': 'a'}
{'d2_desc': 'y', 'd2_sort': 35, 'd1_sort': 1, 'd1_desc': 'a'}
{'d2_desc': 'z', 'd2_sort': 38, 'd1_sort': 1, 'd1_desc': 'a'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 2, 'd1_desc': 'b'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 2, 'd1_desc': 'b'}
{'d2_desc': 'z', 'd2_sort': 38, 'd1_sort': 2, 'd1_desc': 'b'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 3, 'd1_desc': 'c'}
{'d2_desc': 'z', 'd2_sort': 38, 'd1_sort': 3, 'd1_desc': 'c'}
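
To see why this works, it can help to look at the key produced for a single dict. The following is a Python 3 sketch of the same idea, not the answer's exact code: the name arb_kf_py3 is hypothetical, it assumes every key ending in _sort contains a number, and unlike arb_kf it does not convert the value with int.

```python
import re

def arb_kf_py3(d):
    """Return a sorted list of (index, value) pairs for every '*_sort' key in d."""
    pairs = []
    for name, value in d.items():
        if name.endswith('_sort'):
            # Assumption: the key name contains a number, e.g. 'd2_sort' -> 2.
            index = int(re.search(r'\d+', name).group())
            pairs.append((index, value))
    pairs.sort()  # order by the number taken from the key name
    return pairs

row = {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35}
key = arb_kf_py3(row)
print(key)  # [(1, 1), (2, 35)]
```

Each dict is therefore sorted by its d1_sort value first and its d2_sort value second, however many such keys it has.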

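Now suppose the integers in d[X]_sort differ between dicts, and you want smaller numbers to carry more weight; that is, a dict with d0_sort should sort ahead of dicts that have no key with such a small number.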

Because Python compares tuples element by element, this holds:

>>> sorted([(1,99), (1,1,1), (0,50), (1,0,99)])
[(0, 50), (1, 0, 99), (1, 1, 1), (1, 99)]

Since the key function returns a list of tuples, the same rule applies here.
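
As a small illustration of that rule, the sketch below compares hand-written keys in the shape the key function returns (lists of (index, value) tuples). The variable names and values are chosen for illustration only.

```python
# Keys shaped like the key function's return value.
key_d0 = [(0, 3), (2, 38)]   # a dict with d0_sort=3 and d2_sort=38
key_a = [(1, 1), (2, 30)]    # a dict with d1_sort=1 and d2_sort=30
key_b = [(1, 1), (2, 35)]    # a dict with d1_sort=1 and d2_sort=35

# Lists compare element by element, and so do the tuples inside them.
assert key_d0 < key_a   # decided by the first tuple: (0, 3) < (1, 1)
assert key_a < key_b    # first tuples tie, so (2, 30) < (2, 35) decides

ordered = sorted([key_b, key_d0, key_a])
print(ordered)  # [[(0, 3), (2, 38)], [(1, 1), (2, 30)], [(1, 1), (2, 35)]]
```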

So if your example list contains a dict with 'd0_sort': 3, that dict sorts ahead of any dict whose lowest-numbered key is 'd1_sort':

example_list = [{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
    {'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38},
    {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'z', 'd2_sort': 38},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'z', 'd2_sort': 38},
    {'d1_desc': 'b', 'd0_sort': 3, 'd2_desc': 'z', 'd2_sort': 38}
]
>>> for d in sorted(example_list, key=arb_kf) :
...     print d  
{'d0_sort': 3, 'd2_desc': 'z', 'd2_sort': 38, 'd1_desc': 'b'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 1, 'd1_desc': 'a'}
{'d2_desc': 'y', 'd2_sort': 35, 'd1_sort': 1, 'd1_desc': 'a'}
{'d2_desc': 'z', 'd2_sort': 38, 'd1_sort': 1, 'd1_desc': 'a'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 2, 'd1_desc': 'b'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 2, 'd1_desc': 'b'}
{'d2_desc': 'z', 'd2_sort': 38, 'd1_sort': 2, 'd1_desc': 'b'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 3, 'd1_desc': 'c'}
{'d2_desc': 'z', 'd2_sort': 38, 'd1_sort': 3, 'd1_desc': 'c'}
1

I would still use itemgetter, because it is faster, and you only need to create it once and can reuse it every time after that:

from operator import itemgetter

def make_getter(nr):
    keys = ('d%d_sort' % (n + 1) for n in range(nr))
    return itemgetter(*keys)

example_list.sort(key=make_getter(2))

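Creating the itemgetter takes time. If you need it for several lists and its contents are the same each time, store it, for example get_two = make_getter(2), and then use get_two as the key function wherever you need it.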
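
A minimal sketch of that reuse, written so it also runs on Python 3 (range instead of xrange). The two sample lists are made up for illustration. Note that itemgetter needs at least one key, so this assumes nr is 1 or more; with exactly one key it returns the bare value rather than a tuple, which still works as a sort key.

```python
from operator import itemgetter

def make_getter(nr):
    # Build 'd1_sort' .. 'd<nr>_sort'; assumes nr >= 1.
    keys = ['d%d_sort' % n for n in range(1, nr + 1)]
    return itemgetter(*keys)

# Create the getter once and reuse it for every list with two dimensions.
get_two = make_getter(2)

first = [{'d1_sort': 2, 'd2_sort': 30},
         {'d1_sort': 1, 'd2_sort': 35},
         {'d1_sort': 1, 'd2_sort': 30}]
second = [{'d1_sort': 3, 'd2_sort': 38},
          {'d1_sort': 3, 'd2_sort': 30}]

first.sort(key=get_two)
second.sort(key=get_two)
print(first)
print(second)
```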
