Finding the most popular words in a list

13 votes

3 answers

19715 views

Asked 2025-04-16 13:18

I have a list of words:

words = ['all', 'awesome', 'all', 'yeah', 'bye', 'all', 'yeah']

I want to get a list of tuples:

[(3, 'all'), (2, 'yeah'), (1, 'bye'), (1, 'awesome')]

where each tuple is...

(number_of_occurrences, word)

The list should be sorted by number of occurrences, highest first.

This is what I have so far:

def popularWords(words):
    dic = {}
    for word in words:
        dic.setdefault(word, 0)
        dic[word] += 1
    wordsList = [(dic.get(w), w) for w in dic]
    wordsList.sort(reverse = True)
    return wordsList

My question is...

Is this Pythonic, elegant, and efficient? Can you do better? Thanks!


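3 Answers

Is this Pythonic, elegant, and efficient?

It looks fine to me...

Can you do better?

What does "better" mean? If the code is easy to understand and efficient, isn't that enough?

Take a look at defaultdict; it may be a better fit here than setdefault.

Answered 2025-04-16 by Python大师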


The defaultdict collection is exactly what you need:

from collections import defaultdict

D = defaultdict(int)
for word in words:
    D[word] += 1

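This gives you a dictionary whose keys are the words and whose values are the number of times each word occurs. To get (frequency, word) tuples, you can do this: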

# items() works on both Python 2 and 3; iteritems() exists only on Python 2
tuples = [(freq, word) for word, freq in D.items()]
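
The list built this way is not sorted yet. Here is a minimal sketch of the whole thing as one function, assuming the result should be ordered as in the question (highest count first, with tied words in reverse alphabetical order). The name `popular_words` is made up for this illustration:

```python
from collections import defaultdict

def popular_words(words):
    """Return (count, word) tuples, most frequent first."""
    counts = defaultdict(int)  # missing keys start at 0
    for word in words:
        counts[word] += 1
    # Tuples compare by count first, then by word, so one sort is enough.
    return sorted(((freq, word) for word, freq in counts.items()), reverse=True)

words = ['all', 'awesome', 'all', 'yeah', 'bye', 'all', 'yeah']
print(popular_words(words))
# [(3, 'all'), (2, 'yeah'), (1, 'bye'), (1, 'awesome')]
```

Because the sort compares whole tuples, tied counts fall back to comparing the words themselves, which is what puts 'bye' ahead of 'awesome' here.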

If you are on Python 2.7, or on 3.1 or later, you can do the first step with the Counter class from the standard library's collections module:

from collections import Counter
D = Counter(words)
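
Counter is a dict subclass, so it can be read like the defaultdict above. A short sketch, assuming the same example `words` list as in the question:

```python
from collections import Counter

words = ['all', 'awesome', 'all', 'yeah', 'bye', 'all', 'yeah']
D = Counter(words)

print(D['all'])      # 3
print(D['missing'])  # 0, unknown keys count as zero instead of raising KeyError

# The same comprehension works because Counter supports items().
tuples = sorted(((freq, word) for word, freq in D.items()), reverse=True)
print(tuples)
# [(3, 'all'), (2, 'yeah'), (1, 'bye'), (1, 'awesome')]
```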

Answered 2025-04-16 by Python大师


You can use Counter for this.

import collections
words = ['all', 'awesome', 'all', 'yeah', 'bye', 'all', 'yeah']
counter = collections.Counter(words)
print(counter.most_common())
# [('all', 3), ('yeah', 2), ('awesome', 1), ('bye', 1)] on Python 3.7+; the order of tied words may differ on older versions

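Note that the fields of each tuple come back in the opposite order from what the question asks for: (word, count) rather than (count, word).

As noted in the comments: collections.Counter is available in Python 2.7 and in 3.1 and later. On an older version, you can fall back on a standalone implementation of Counter.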
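
If the (count, word) layout from the question is needed, the pairs can be swapped in a comprehension. This is only a sketch; it keeps whatever order most_common() produces, so tied words may not come out in the same order as in the question:

```python
from collections import Counter

words = ['all', 'awesome', 'all', 'yeah', 'bye', 'all', 'yeah']
counter = Counter(words)

# most_common() yields (word, count); swap each pair to get (count, word).
pairs = [(count, word) for word, count in counter.most_common()]
print(pairs)

# Passing n limits the result to the n most frequent words.
print(counter.most_common(2))
# [('all', 3), ('yeah', 2)]
```

If the tie order matters, sorting the swapped list with sorted(pairs, reverse=True) reproduces the ordering from the question.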

Answered 2025-04-16 by Python大师
