How can I get a count dictionary while preserving the order in which items appear?

10 votes

4 answers

6727 views

Asked 2025-04-18 07:02

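For example, I need to count how many times each word occurs in a list, but ordered by when each word first appears (insertion order) rather than by frequency.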

from collections import Counter

words = ['oranges', 'apples', 'apples', 'bananas', 'kiwis', 'kiwis', 'apples']

c = Counter(words)

print(c)

So the result I want is not: {'apples': 3, 'kiwis': 2, 'bananas': 1, 'oranges': 1}

but rather: {'oranges': 1, 'apples': 3, 'bananas': 1, 'kiwis': 2}

I don't particularly need Counter for this; any approach that gives the correct result is fine.

Tags: data structures, insertion order, ordered dictionary, word count, count dictionary

4 Answers

The comments in the code explain each step:

text_list = ['oranges', 'apples', 'apples', 'bananas', 'kiwis', 'kiwis', 'apples']


# create empty dictionary
freq_dict = {}
 
# loop through text and count words
for word in text_list:
    # set the default value to 0
    freq_dict.setdefault(word, 0)
    # increment the value by 1
    freq_dict[word] += 1
 
print(freq_dict)

{'oranges': 1, 'apples': 3, 'bananas': 1, 'kiwis': 2}


Answered 2025-04-18 by Python大师


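In Python 3.6, dictionaries are insertion ordered, but this is only an implementation detail (of CPython).

In Python 3.7 and later, insertion order is guaranteed by the language and can be relied on. For more, see "Are dictionaries ordered in Python 3.6+?".

So, depending on your Python version, you can simply use Counter as it is, without creating an OrderedCounter class as described in the documentation. This works because Counter is a subclass of dict, i.e. issubclass(Counter, dict) returns True, and so it inherits dict's insertion-order behavior.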
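As a minimal sketch of the point above, assuming Python 3.7 or later and reusing the word list from the question, iterating over a plain Counter (rather than printing it) follows the order in which keys were first seen. The variable names are illustrative only.

```python
from collections import Counter

words = ['oranges', 'apples', 'apples', 'bananas', 'kiwis', 'kiwis', 'apples']
c = Counter(words)

# Counter is a dict subclass, so it inherits dict's ordering behavior.
is_dict_subclass = issubclass(Counter, dict)

# Iterating (or converting to a plain dict) follows first-seen order,
# unlike printing the Counter, which sorts by count.
for word, count in c.items():
    print(word, count)

insertion_ordered = dict(c)
print(insertion_ordered)
```

Converting with dict(c) is only needed for display; the counts and their order are already stored that way inside the Counter.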

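String representation

Note that Counter's string representation, defined in its __repr__ method, has not been updated to reflect the 3.6 and 3.7 changes: print(Counter(some_iterable)) still shows items from the highest count to the lowest. You can easily get the keys in insertion order with list(Counter(some_iterable)).

Here are some examples demonstrating this behavior: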

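from collections import Counter

# OrderedCounter is the Counter + OrderedDict recipe class, not part of collections
x = 'xyyxy'
print(Counter(x))         # Counter({'y': 3, 'x': 2}), i.e. most common first
print(list(Counter(x)))   # ['x', 'y'], i.e. insertion ordered
print(OrderedCounter(x))  # OC(OD([('x', 2), ('y', 3)])), i.e. insertion ordered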

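Exceptions

If you need the additional or overridden methods that OrderedCounter provides, you should not use a plain Counter. In particular:

OrderedDict, and therefore OrderedCounter, provides move_to_end and a popitem that accepts a last argument (the plain dict popitem always removes the most recently inserted item).
Equality tests between OrderedCounter objects are order-sensitive where OrderedDict's __eq__ is the one in effect, and are carried out as list(oc1.items()) == list(oc2.items()). From Python 3.10, Counter defines its own order-insensitive __eq__, which comes first in the method resolution order of the Counter-plus-OrderedDict recipe, so check the behavior on your version.

For example, the equality tests can give different results: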

Counter('xy') == Counter('yx')                # True
OrderedCounter('xy') == OrderedCounter('yx')  # False before Python 3.10; expect True from 3.10, where Counter.__eq__ takes precedence
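The OrderedDict-specific methods mentioned above can be sketched as follows. This assumes the Counter-plus-OrderedDict recipe class (reduced here to its two base classes, without the custom __repr__); the input string 'xyyxz' is just an illustration.

```python
from collections import Counter, OrderedDict


class OrderedCounter(Counter, OrderedDict):
    """Counter that remembers the order elements are first encountered."""


oc = OrderedCounter('xyyxz')
before = list(oc)                # keys in first-seen order
print(before)

# move_to_end comes from OrderedDict; a plain Counter does not have it.
oc.move_to_end('x')
after_move = list(oc)
print(after_move)

# OrderedDict.popitem accepts last=False to remove the oldest remaining key.
first_item = oc.popitem(last=False)
print(first_item)
```

A plain Counter supports neither move_to_end nor popitem(last=False), which is the main reason to keep the recipe class around.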

Answered 2025-04-18 by Python大师


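In Python 3.6 and later, dict keeps the order in which you add items (an implementation detail in CPython 3.6, guaranteed by the language from 3.7).

So you can do this: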

words = ["oranges", "apples", "apples", "bananas", "kiwis", "kiwis", "apples"]
counter = {}
for w in words:
    counter[w] = counter.get(w, 0) + 1
>>> counter
{'oranges': 1, 'apples': 3, 'bananas': 1, 'kiwis': 2}

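However, in Python 3.6, 3.7, and later, Counter does not display the insertion order that it maintains; instead, its __repr__ lists the results from most common to least common. You can read more about this here.

But you can use the same OrderedDict recipe, simply substituting the Python 3.6+ dict for OrderedDict: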

from collections import Counter

class OrderedCounter(Counter, dict):
    'Counter that remembers the order elements are first encountered'
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, dict(self))

    def __reduce__(self):
        return self.__class__, (dict(self),)

>>> OrderedCounter(words)
OrderedCounter({'oranges': 1, 'apples': 3, 'bananas': 1, 'kiwis': 2})

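Alternatively, since Counter is a subclass of the order-preserving dict in Python 3.6 and later, you can avoid Counter's __repr__ by calling .items() on it or by converting the Counter back to a dict:

>>> c = Counter(words)

This display of the Counter is sorted from the most common element to the least common, because it uses Counter's __repr__ method: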

>>> c
Counter({'apples': 3, 'kiwis': 2, 'oranges': 1, 'bananas': 1})

This display, by contrast, follows the order in which the items were added, i.e. insertion order:

>>> c.items()
dict_items([('oranges', 1), ('apples', 3), ('bananas', 1), ('kiwis', 2)])

Or:

>>> dict(c)
{'oranges': 1, 'apples': 3, 'bananas': 1, 'kiwis': 2}

Answered 2025-04-18 by Python大师


You can use this recipe, which combines collections.Counter and collections.OrderedDict:

from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first encountered'

    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))

    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)

words = ["oranges", "apples", "apples", "bananas", "kiwis", "kiwis", "apples"]
c = OrderedCounter(words)
print(c)
# OrderedCounter(OrderedDict([('oranges', 1), ('apples', 3), ('bananas', 1), ('kiwis', 2)]))
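As a rough illustration of how the recipe behaves after construction, the sketch below checks that the usual Counter methods still work on the subclass and that later updates keep existing keys in place while appending new ones. The extra words passed to update ('pears', 'oranges') are made up for the example.

```python
from collections import Counter, OrderedDict


class OrderedCounter(Counter, OrderedDict):
    """Counter that remembers the order elements are first encountered."""

    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))

    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)


words = ["oranges", "apples", "apples", "bananas", "kiwis", "kiwis", "apples"]
c = OrderedCounter(words)

# Counter methods such as most_common are inherited unchanged.
top = c.most_common(1)
print(top)

# An existing key keeps its position; a new key goes to the end.
c.update(["pears", "oranges"])
order = list(c.items())
print(order)
```

So the subclass only changes how the counts are displayed and pickled; counting itself is still done by Counter.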

Answered 2025-04-18 by Python大师

