Find the nth most common word and its count in Python

from collections import Counter

def nth_most(str_in, n):
    split_it = str_in.split(" ")
    array = []
    for word, count in Counter(split_it).most_common(n):
        list = [word, count]
        array.append(count)
        array.sort()
        if len(array) - n <= len(array) - 1:
            c = array[len(array) - n]
    return [word, c]

Traceback (most recent call last):
  File "/grade/run/test.py", line 20, in test_negative
    self.assertEqual(nth_most('awe Awe AWE BLUE BLUE call', 1), ['awe', 3])
AssertionError: Lists differ: ['BLUE', 2] != ['awe', 3]

First differing element 0:
'BLUE'
'awe'

3 Answers

User

Answer 1 · edited 2024-06-16 15:09:31

def nth_common(lowered_words, check):
    m = []
    for i in lowered_words:
        m.append((i, lowered_words.count(i)))
    for i in set(m):
        # print(i)
        if i[1] == check:  # the tuple's second element (index 1) is the occurrence count
            print(i, "found")
    del m[:] # deleting list for using it again


words = ['apple', 'apple', 'apple', 'blue', 'BLue', 'call', 'cAlL']
lowered_words = [x.lower() for x in words]   # ignoring the uppercase
check = 2   # the check

nth_common(lowered_words, check)

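Output (the two lines may appear in either order, since a set is unordered):

('blue', 2) found
('call', 2) found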

User

Answer 2 · edited 2024-06-16 15:09:31

Since you are already using Counter, make good use of it:

import collections

def nth_most(str_in, n):
    c = sorted(collections.Counter(w.lower() for w in str_in.split()).items(), key=lambda x: x[1])
    return list(c[-n])  # convert to list as it seems to be the expected output

print(nth_most("apple apple apple blue BlUe call",2))

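This builds a word-frequency dictionary, sorts its items by value (the second element of each tuple), and picks the nth element from the end.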

This prints ['blue', 2].

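What if two words tie for first or second place with the same frequency? Then this solution does not work. Instead, sort the occurrence counts, take the nth most common count, and then go through the counter dict again to extract every word that matches it.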
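A minimal sketch of that tie-aware approach, not necessarily the answerer's exact code. The function name `nth_most_with_ties` and the sample sentence are assumptions made for illustration, and the sketch ranks distinct counts so that tied words share one rank:

```python
from collections import Counter

def nth_most_with_ties(str_in, n):
    """Return every [word, count] pair whose count is the nth highest distinct count."""
    counts = Counter(w.lower() for w in str_in.split())
    # Distinct counts, highest first, so tied words share a single rank.
    distinct = sorted(set(counts.values()), reverse=True)
    if n < 1 or n > len(distinct):
        return []  # assumption: an out-of-range n yields an empty list
    target = distinct[n - 1]
    # Second pass over the counter to collect all words with that count.
    return [[word, count] for word, count in counts.items() if count == target]

# Illustrative input (an assumption): "call" and "blue" tie for second place.
print(nth_most_with_ties("apple apple apple call blue BlUe call woot", 2))
```

Words with the same count come back in the order they first appear in the input, because Counter preserves insertion order in Python 3.7 and later.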


This time it prints:

[['call', 2], ['blue', 2]]

User

Answer 3 · edited 2024-06-16 15:09:31

Counter.most_common returns the elements in order from most to least common, so you can do the following:

list(Counter(str_in.lower().split()).most_common(n)[-1]) # n is nth most common word
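As a quick check against the failing test in the question, here is a minimal sketch that wraps this one-liner in a function. The name `nth_most` comes from the question; returning `None` for an out-of-range `n` is an added assumption, and ties are not handled here.

```python
from collections import Counter

def nth_most(str_in, n):
    """Return [word, count] for the nth most common word, ignoring case."""
    counts = Counter(str_in.lower().split())
    # Assumption: return None when n is out of range for the distinct words.
    if n < 1 or n > len(counts):
        return None
    # most_common(n) lists the top n pairs; the last one is the nth.
    return list(counts.most_common(n)[-1])

print(nth_most('awe Awe AWE BLUE BLUE call', 1))
```

Lowercasing the whole string before splitting is what merges 'awe', 'Awe' and 'AWE' into one entry, which the question's code does not do.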
