How do I sort a list by frequency and then alphabetically?

from collections import Counter

def count_words(s, n):
    """Return the n most frequently occurring words in s."""
    # TODO: Count the number of occurrences of each word in s
    words = s.split()
    counts = Counter(words)
    # TODO: Sort the occurrences in descending order (alphabetically in case of ties)
    # TODO: Return the top n most frequent words.
    return counts.most_common(n)

print(count_words("betty bought a bit of butter but the butter was bitter", 3))

3 Answers

User

#1 · edited 2024-04-25 04:05:05

As the Python docs put it:

most_common([n])
Return a list of the n most common elements and their counts from the most common to the least. If n is omitted or None, most_common() returns all elements in the counter. Elements with equal counts are ordered arbitrarily:

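So the entries with a count of 1 are not guaranteed to come back in any particular order, because the underlying structure is a dict.

If you want the results in alphabetical order, you need to do some extra processing.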
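A quick way to see this is to pull out the tied words and check whether they happen to be alphabetical. This is only a sketch using the sentence from the question; the names `counts` and `tied` are illustrative. On recent CPython versions ties usually come back in first-seen order, but that is still not alphabetical order:

```python
from collections import Counter

counts = Counter("betty bought a bit of butter but the butter was bitter".split())

# Collect every word that appears exactly once, in the order most_common() yields them.
tied = [word for word, count in counts.most_common() if count == 1]

print(tied)
# Compare against the alphabetical ordering of the same words.
print(tied == sorted(tied))
```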

from collections import Counter

c = Counter()  # counter-generating code goes here

print(sorted(c.most_common(), key=lambda i: (-i[1], i[0]))[:3])

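This first gets all the results via c.most_common(), then sorts them by the second element (word frequency) in descending order and by the first element (the word) in ascending order. Finally, it takes a slice of the first 3 elements as the result.

Edit: I realized my earlier version was not sorting correctly, since it could only sort in ascending order.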
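Put together as a complete function, the idea might look like the sketch below. It reuses the `count_words` signature from the question; the `ranked` variable is an illustrative name, and sorting `counts.items()` directly is a small variation on calling `most_common()` first:

```python
from collections import Counter

def count_words(s, n):
    """Return the n most frequent words in s, ties broken alphabetically."""
    counts = Counter(s.split())
    # Negating the count sorts frequencies in descending order,
    # while the word itself sorts ascending as the tie-breaker.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]

result = count_words("betty bought a bit of butter but the butter was bitter", 3)
print(result)
```

Negating the count works because the counts are numbers; the words cannot be negated, which is why they stay as the ascending part of the key.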

User

#2 · edited 2024-04-25 04:05:05

You can do this by specifying a key function:

>>> L = [('butter', 2), ('a', 1), ('bitter', 1), ('betty', 1)]
>>> sorted(L, key=lambda x: (-x[1], x[0]))
[('butter', 2), ('a', 1), ('betty', 1), ('bitter', 1)]

Since Python's sort is stable, another approach is to sort alphabetically first and then sort by count in reverse order.
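A minimal sketch of this two-pass approach, using the same sample data as the key-function example (the names `pairs`, `by_word`, and `by_count` are illustrative, not from the answer):

```python
from operator import itemgetter

pairs = [('butter', 2), ('a', 1), ('bitter', 1), ('betty', 1)]

# Pass 1: sort alphabetically. Tuples compare by their first element (the word) first.
by_word = sorted(pairs)

# Pass 2: sort by count, descending. Because the sort is stable,
# words with equal counts keep their alphabetical order from pass 1.
by_count = sorted(by_word, key=itemgetter(1), reverse=True)

print(by_count)
```

The order of the passes matters: the least significant key (the word) is sorted first, and the most significant key (the count) last.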

User

#3 · edited 2024-04-25 04:05:05

First count all the words using the idea of a bucket: a dictionary whose keys are the words and whose values are the number of occurrences.

>>> words = "betty bought a bit of butter but the butter was bitter".split()
>>> bucket = {}
>>> for word in words:
...     if word in bucket:
...         bucket[word] += 1
...     else:
...         bucket[word] = 1
...
>>> bucket
{'betty': 1, 'bought': 1, 'a': 1, 'bit': 1, 'of': 1, 'butter': 2, 'but': 1, 'the': 1, 'was': 1, 'bitter': 1}

Calling the sorted function with no extra arguments sorts the items by key name.
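As a sketch of that step, assuming the bucket is built from the sentence in the question (`by_key` is an illustrative name, and `dict.get` is used here only to keep the counting loop short):

```python
words = "betty bought a bit of butter but the butter was bitter".split()

bucket = {}
for word in words:
    bucket[word] = bucket.get(word, 0) + 1

# (word, count) tuples compare by word first, so no key function is needed.
by_key = sorted(bucket.items())
print(by_key)
```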

Then sort by value, from highest to lowest:

>>> sorted(bucket.items(), key=lambda kv_pair: kv_pair[1], reverse=True)
[('butter', 2), ('betty', 1), ('bought', 1), ('a', 1), ('bit', 1), ('of', 1), ('but', 1), ('the', 1), ('was', 1), ('bitter', 1)]
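Note that in the output above the tied words appear in insertion order, not alphabetical order, because the value sort was applied to `bucket.items()` directly. To get alphabetical ties, the value sort can be applied to the key-sorted list instead. A sketch, with `by_key`, `ranked`, and `top3` as illustrative names:

```python
words = "betty bought a bit of butter but the butter was bitter".split()

bucket = {}
for word in words:
    bucket[word] = bucket.get(word, 0) + 1

# Step 1: alphabetical order by word.
by_key = sorted(bucket.items())

# Step 2: stable sort by count, descending; ties stay alphabetical.
ranked = sorted(by_key, key=lambda kv_pair: kv_pair[1], reverse=True)

top3 = ranked[:3]
print(top3)
```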
