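Sorting a list dynamically by a varying number of attributes
I've seen some solutions for sorting a list by a fixed number of attributes: Sort a list by multiple attributes?
One of them has a nice sorting approach: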
s = sorted(s, key=lambda x: (x[1], x[2]))
There is also an example that uses itemgetter.
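For reference, here is a minimal sketch of that fixed-attribute case done both ways. The sample data in `s` is made up for illustration and is not from the linked question; the point is only that `itemgetter(1, 2)` builds the same `(x[1], x[2])` tuple as the lambda.

```python
from operator import itemgetter

# Hypothetical sample data: each row is (name, group, score).
s = [("a", 2, 30), ("b", 1, 35), ("c", 1, 30)]

# Sort by index 1, then by index 2, using a lambda key.
by_lambda = sorted(s, key=lambda x: (x[1], x[2]))

# Same ordering with itemgetter, which returns the tuple (x[1], x[2]).
by_getter = sorted(s, key=itemgetter(1, 2))
```

Both approaches hard-code how many fields take part in the sort, which is the limitation this question is about.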
However, in my case the number of attributes varies. For example, with two attributes:
example_list = [{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
                {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
                {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
                {'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38},
                # etc.
]
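But it could just as well be 1, 3, or more, so I can't use a lambda function or itemgetter in the way shown above. I do, however, know the number of dimensions at run time (though it differs from run to run). So I did this (example with the parameter set to 2 dimensions):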
example_list = [{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38},
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'z', 'd2_sort': 38},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'z', 'd2_sort': 38}
]
def order_get(a, nr):
    # Collect the values of d1_sort .. d<nr>_sort, in that order.
    result = []
    for i in range(1, nr + 1):
        result.append(a.get('d' + str(i) + '_sort'))
    return result

example_list.sort(key=lambda x: order_get(x, 2))  # hard-coded to 2 for this example
In [82]: example_list
Out[82]:
[{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'z', 'd2_sort': 38},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'z', 'd2_sort': 38},
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38}]
But is this really the best way to do it? I mean 1) is it Pythonic, and 2) how does it perform? Is this a common problem?
2 Answers
1
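You can support any number of sort keys, as long as their names follow a predictable pattern.
Suppose you have d[X]_sort through d[Y]_sort, where X and Y are integers and every sort key ends in _sort. The key function can then be written like this: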
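import re

def arb_kf(d):
    # Keep only the sort keys, i.e. those ending in '_sort'.
    li = filter(lambda s: s.endswith('_sort'), d)
    # Build one (index, value) tuple per sort key, e.g. 'd2_sort': 30 -> (2, 30).
    rtr = [tuple(map(int, re.findall(r'([0-9]+)', k) + [d[k]])) for k in li]
    rtr.sort()
    return rtr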
Using your list of dicts as an example:
example_list = [{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38},
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'z', 'd2_sort': 38},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'z', 'd2_sort': 38}
]
>>> for d in sorted(example_list, key=arb_kf):
...     print(d)
...
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30}
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35}
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'z', 'd2_sort': 38}
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30}
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30}
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'z', 'd2_sort': 38}
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'x', 'd2_sort': 30}
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38}
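Now suppose the integer in d[X]_sort differs between dictionaries, and you want smaller numbers to carry more weight; that is, a dictionary with d0_sort should sort ahead of dictionaries that lack such a smaller number.
Because Python compares tuples element by element, this holds: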
>>> sorted([(1,99), (1,1,1), (0,50), (1,0,99)])
[(0, 50), (1, 0, 99), (1, 1, 1), (1, 99)]
Since the key function returns a list of tuples, the same applies here.
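To make that concrete, here is a small sketch that compares keys directly. The three keys are hand-written stand-ins for what a key function returning a list of (index, value) tuples might produce; they are illustrative values, not output captured from the code above.

```python
# Hand-written keys shaped as lists of (index, value) tuples.
key_d0 = [(0, 3), (2, 38)]   # as for a dict with d0_sort and d2_sort
key_a = [(1, 1), (2, 30)]    # as for a dict with d1_sort and d2_sort
key_b = [(1, 1), (2, 35)]

# Lists compare element by element, and each tuple element does the same.
assert key_d0 < key_a   # (0, 3) < (1, 1) decides at the first element
assert key_a < key_b    # first tuples tie, so (2, 30) < (2, 35) decides

ordered = sorted([key_b, key_d0, key_a])
```

The lower index wins before any value is looked at, which is why a key starting with index 0 sorts first regardless of its value.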
So if your example list contains a dictionary with 'd0_sort': 3, it sorts ahead of any dictionary containing 'd1_sort':
example_list = [{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38},
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'z', 'd2_sort': 38},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'z', 'd2_sort': 38},
{'d1_desc': 'b', 'd0_sort': 3, 'd2_desc': 'z', 'd2_sort': 38}
]
>>> for d in sorted(example_list, key=arb_kf):
...     print(d)
...
{'d1_desc': 'b', 'd0_sort': 3, 'd2_desc': 'z', 'd2_sort': 38}
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30}
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35}
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'z', 'd2_sort': 38}
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30}
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30}
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'z', 'd2_sort': 38}
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'x', 'd2_sort': 30}
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38}
1
I would still use itemgetter, because it is faster, and you only need to create it once and can then reuse it every time:
from operator import itemgetter

def make_getter(nr):
    # Key names d1_sort .. d<nr>_sort, bound into a single itemgetter.
    keys = ('d%d_sort' % (n + 1) for n in range(nr))
    return itemgetter(*keys)

example_list.sort(key=make_getter(2))
Creating the itemgetter takes time. If you need it for several lists that share the same layout, store it, for example get_two = make_getter(2), and then use get_two as the key function wherever you need it.
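A minimal sketch of that reuse might look as follows. The lists `list_a` and `list_b` are hypothetical data invented for illustration, and `make_getter` is restated (with `range`) so the snippet runs on its own.

```python
from operator import itemgetter

def make_getter(nr):
    # Build the key names d1_sort .. d<nr>_sort and bind them into one itemgetter.
    keys = ['d%d_sort' % (n + 1) for n in range(nr)]
    return itemgetter(*keys)

get_two = make_getter(2)  # created once, reused below

# Hypothetical lists that share the same two-dimension layout.
list_a = [{'d1_sort': 2, 'd2_sort': 30}, {'d1_sort': 1, 'd2_sort': 35}]
list_b = [{'d1_sort': 1, 'd2_sort': 38}, {'d1_sort': 1, 'd2_sort': 30}]

list_a.sort(key=get_two)
list_b.sort(key=get_two)
```

Two differences from the `order_get` version in the question are worth keeping in mind: itemgetter raises KeyError when a key is missing, whereas `a.get(...)` returns None, and with a single key it returns the bare value rather than a tuple, which still works as a sort key.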