Sorting an iterable by a priority order of strings

2 votes

4 answers

1555 views

Asked 2025-04-16 15:40

Suppose I have a list/tuple like this:

MyLocation = 'DE'
(    
('Pencils', 'Artists Pencils', 18.95, 'PVT', 'DE'),
('Pencils', '', 19.95, 'PVT', 'IT'),
('Pencils', '', 23.50, 'PRF1', 'US'),
('Pencils', 'Wooden Pencils', 23.50, 'PRF2', 'DE'),
('Pencils', '', 12.50, 'NON', 'DE'))

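I want to sort it in two passes, following these rules:

1) Put the tuples whose element [4] matches the string 'DE' at the top.
This is an intermediate step, and the relative order among the DE tuples does not matter. All that matters is that every DE tuple ends up at the top.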

(    
('Pencils', '', 12.50, 'NON', 'DE'),
('Pencils', 'Wooden Pencils', 23.50, 'PRF2', 'DE'),
('Pencils', 'Artists Pencils', 18.95, 'PVT', 'DE'),    
('Pencils', '', 23.50, 'PRF1', 'US'),
('Pencils', '', 19.95, 'PVT', 'IT')       
)

2) Then sort by element [3], using the priority order ['PRF1', 'PRF2', 'PRF3']. Any other strings can go after those.

The final sorted result I expect is:

(    
('Pencils', '', 23.50, 'PRF1', 'US'),
('Pencils', 'Wooden Pencils', 23.50, 'PRF2', 'DE'),
('Pencils', 'Artists Pencils', 18.95, 'PVT', 'DE'),    
('Pencils', '', 12.50, 'NON', 'DE'),
('Pencils', '', 19.95, 'PVT', 'IT')       
)

How should I do these two sorts? I can handle the first one by deleting and inserting, but what is the recommended way?

tempList = actualList
i = 0
for record in actualList:
    if record[4] == 'DE':
        del tempList[i]
        tempList.insert(0, record)
    i = i + 1
actualList = tempList

What confuses me most is how to do the second sort. Please provide sample code for the second sort.


4 Answers

You only need a single pass, with a suitable key function.

def key(t):
    return (
        dict(PRF1=0, PRF2=1, PRF3=2).get(t[3], 3), # earlier ones get smaller numbers
        int(t[4] != 'DE')) # 0 if DE, 1 otherwise

L.sort(key=key)

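The key function returns a value that is used to compare the elements of the list. Here it returns a two-element tuple, and tuples are compared by their first differing element. So (1, 0) < (2, -300) holds because 1 is less than 2.

The first value is the index of t[3] in the list ['PRF1', 'PRF2', 'PRF3'], or 3 if it is not in that list. The earlier an element appears in the list, the smaller its value and the earlier it sorts. The second value is explained in the code comment. :)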
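To make the tuple comparison concrete, here is a minimal sketch that applies the same key to the data from the question. The names `PRF_RANK`, `records`, `keys` and `ordered` are mine, not part of the answer above, and the rank dict is hoisted out of the function only so it is not rebuilt on every call.

```python
# Rank table for element [3]; anything not listed gets the next rank (3).
PRF_RANK = {'PRF1': 0, 'PRF2': 1, 'PRF3': 2}

def key(t):
    # (priority rank of t[3], 0 if t[4] is 'DE' else 1)
    return (PRF_RANK.get(t[3], len(PRF_RANK)), int(t[4] != 'DE'))

records = [
    ('Pencils', 'Artists Pencils', 18.95, 'PVT', 'DE'),
    ('Pencils', '', 19.95, 'PVT', 'IT'),
    ('Pencils', '', 23.50, 'PRF1', 'US'),
    ('Pencils', 'Wooden Pencils', 23.50, 'PRF2', 'DE'),
    ('Pencils', '', 12.50, 'NON', 'DE'),
]

# Tuples compare element by element, stopping at the first difference.
assert (1, 0) < (2, -300)   # decided by the first element
assert (3, 0) < (3, 1)      # first elements tie, so the second decides

keys = [key(t) for t in records]
ordered = sorted(records, key=key)

for row in ordered:
    print(key(row), row)
```

Rows whose keys are equal, such as the two (3, 0) rows, keep their original relative order because Python's sort is stable.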

Answered 2025-04-16 by Python大师


This is enough:

PRF = ('PRF1', 'PRF2', 'PRF3')
sorted(records, key=lambda x:(x[4]!='DE', PRF.index(x[3]) if x[3] in PRF else 3))

Or, if you plan to use this several times, you may want to factor out the key function:

k = lambda x: (x[4]!='DE', PRF.index(x[3]) if x[3] in PRF else len(PRF))

Then simply use

sorted(records, key=k)

On your example:

>>> records = ( ('Pencils', 'Artists Pencils', 18.95, 'PVT', 'DE'),
... ('Pencils', '', 19.95, 'PVT', 'IT'),
... ('Pencils', '', 23.50, 'PRF1', 'US'),
... ('Pencils', 'Wooden Pencils', 23.50, 'PRF2', 'DE'),
... ('Pencils', '', 12.50, 'NON', 'DE') )
>>> import pprint
>>> pprint.pprint(sorted(records, key=k))
[('Pencils', 'Wooden Pencils', 23.5, 'PRF2', 'DE'),
 ('Pencils', 'Artists Pencils', 18.95, 'PVT', 'DE'),
 ('Pencils', '', 12.5, 'NON', 'DE'),
 ('Pencils', '', 23.5, 'PRF1', 'US'),
 ('Pencils', '', 19.95, 'PVT', 'IT')]
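Note that this key compares location first, so the PRF1/US row lands fourth in the output above, whereas the question's expected result has it first. If the question's final ordering is what you want, one option is to do literally two sorts and rely on Python's sort being stable: sort by the less important criterion first, then by the more important one. This is a sketch under that assumption; `prf_rank`, `by_location` and `result` are illustrative names.

```python
PRF = ('PRF1', 'PRF2', 'PRF3')

records = (
    ('Pencils', 'Artists Pencils', 18.95, 'PVT', 'DE'),
    ('Pencils', '', 19.95, 'PVT', 'IT'),
    ('Pencils', '', 23.50, 'PRF1', 'US'),
    ('Pencils', 'Wooden Pencils', 23.50, 'PRF2', 'DE'),
    ('Pencils', '', 12.50, 'NON', 'DE'),
)

def prf_rank(row):
    # Position in PRF, or len(PRF) for strings that are not listed.
    return PRF.index(row[3]) if row[3] in PRF else len(PRF)

# Pass 1 (lower priority): DE rows first. False sorts before True.
by_location = sorted(records, key=lambda row: row[4] != 'DE')

# Pass 2 (higher priority): PRF order. Stability keeps the DE-first
# order from pass 1 among rows that share the same PRF rank.
result = sorted(by_location, key=prf_rank)

for row in result:
    print(row)
```

The same ordering comes out of a single sort if the two elements of the key tuple are swapped, i.e. `(prf_rank(x), x[4] != 'DE')`.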

Answered 2025-04-16 by Python大师


The general idea is to give each item a score. When an item has several scores, put them together in a tuple.

MyLocation = 'DE'
location_score = { MyLocation : 1 }
that_other_field_score = {'PRF1' : 3, 'PRF2' : 2, 'PRF3' : 1}

def score( row ):
    # returns a tuple of item score
    # items not in the score dicts get score 0 for that field
    return ( that_other_field_score.get(row[3], 0),
                  location_score.get(row[4], 0))    

data = [    
('Pencils', 'Artists Pencils', 18.95, 'PVT', 'DE'),
('Pencils', '', 19.95, 'PVT', 'IT'),
('Pencils', '', 23.50, 'PRF1', 'US'),
('Pencils', 'Wooden Pencils', 23.50, 'PRF2', 'DE'),
('Pencils', '', 12.50, 'NON', 'DE')]

# sort data, highest score first
data.sort(key=score, reverse=True)
print(data)

The location_score dict may be overkill (you could simply write (1 if row[4]=='DE' else 0)), but it has the advantage of being easy to extend later.
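As an illustration of that kind of extension, here is a rough sketch that builds the score dicts from priority lists. The helper `make_scores` and the idea of ranking 'PVT' just below the PRF codes are hypothetical, added only to show how a new rule would slot in. They are not part of the question.

```python
def make_scores(priority):
    # First name gets the highest score, last name gets 1.
    # Names that are not listed score 0 through dict.get() below.
    return {name: len(priority) - i for i, name in enumerate(priority)}

# Hypothetical extension: 'PVT' now outranks unlisted codes such as 'NON'.
field_scores = make_scores(['PRF1', 'PRF2', 'PRF3', 'PVT'])
location_scores = make_scores(['DE'])

def score(row):
    return (field_scores.get(row[3], 0), location_scores.get(row[4], 0))

data = [
    ('Pencils', 'Artists Pencils', 18.95, 'PVT', 'DE'),
    ('Pencils', '', 19.95, 'PVT', 'IT'),
    ('Pencils', '', 23.50, 'PRF1', 'US'),
    ('Pencils', 'Wooden Pencils', 23.50, 'PRF2', 'DE'),
    ('Pencils', '', 12.50, 'NON', 'DE'),
]

# Highest score first, as in the answer above.
ranked = sorted(data, key=score, reverse=True)

for row in ranked:
    print(score(row), row)
```

With this extra rule the PVT/IT row moves ahead of the NON/DE row, because the field score is compared before the location score.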

Answered 2025-04-16 by Python大师

