Best way to count character occurrences in a string

5 votes
6 answers
6113 views
Asked 2025-04-17 10:45

Hi, I'm trying to write this Python code as a single line, but I'm getting errors because the code modifies a dictionary.

for i in range(len(string)):
    if string[i] in dict:
        dict[string[i]] += 1

I think the general syntax is

abc = [i for i in len(x) if x[i] in array]

Can anyone tell me how this would work, given that I'm adding 1 to the dictionary's value?

Thanks!

6 Answers

7

An alternative for Python 2.7 and later:

from collections import Counter

abc = Counter('asdfdffa')
print(abc)
print(abc['a'])

Output:

Counter({'f': 3, 'a': 2, 'd': 2, 's': 1})
2

7

This is a job for the collections module:

Option 1: collections.defaultdict

>>> from collections import defaultdict
>>> mydict = defaultdict(int)

Your loop then becomes:

>>> for mychar in mystring: mydict[mychar] += 1
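
For reference, here is a minimal self-contained sketch of Option 1. The sample value of mystring is an assumption added for illustration; the answer does not define it.

```python
from collections import defaultdict

# Assumed sample input; the original answer leaves mystring undefined.
mystring = 'asdfdffa'

# defaultdict(int) supplies 0 for missing keys, so += 1 never raises KeyError.
mydict = defaultdict(int)
for mychar in mystring:
    mydict[mychar] += 1

print(sorted(mydict.items()))  # [('a', 2), ('d', 2), ('f', 3), ('s', 1)]
```

Unlike the loop in the question, this counts every character it sees rather than only the keys already present in the dictionary.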

Option 2: collections.Counter (from Felix's comment):

This is an alternative better suited to this particular case, from the same collections module:

>>> from collections import Counter

Then all you need is (!!!):

>>> mydict = Counter(mystring)

Counter is only available in Python 2.7 and later. If you are on a version earlier than 2.7, you should stick with defaultdict.
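
Note that the loop in the question only increments keys that already exist in the dictionary, whereas Counter counts every character. If that filtering matters, one possible sketch is below. The function name count_known_chars and its parameters are hypothetical, introduced only for illustration.

```python
from collections import Counter

def count_known_chars(text, allowed):
    """Count only the characters of text that appear in allowed.

    Hypothetical helper: mirrors the question's 'if c in dict' check.
    """
    counts = Counter(ch for ch in text if ch in allowed)
    # Report every allowed character, including those that never occur (0).
    return {ch: counts[ch] for ch in allowed}

print(count_known_chars('asdfdffa', 'afz'))  # {'a': 2, 'f': 3, 'z': 0}
```

Indexing a Counter with a missing key returns 0 instead of raising KeyError, which is why the final dictionary can include characters that never appeared.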

7

What you want to do can be achieved with dict, a generator expression, and str.count():

abc = dict((c, string.count(c)) for c in string)

Another option is to use set(string) (as suggested in soulcheck's comment below):

abc = dict((c, string.count(c)) for c in set(string))
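
As a quick check, the two forms produce the same mapping. The difference is that the set version calls str.count() once per distinct character instead of once per character. This is only a sketch; the variable names and sample text are assumptions.

```python
text = 'asdfdffa'  # assumed sample input

# One str.count() call per character (8 calls here).
per_char = dict((c, text.count(c)) for c in text)

# One str.count() call per distinct character (4 calls here).
per_distinct = dict((c, text.count(c)) for c in set(text))

print(per_char == per_distinct)        # True
print(sorted(per_distinct.items()))    # [('a', 2), ('d', 2), ('f', 3), ('s', 1)]
```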

Timing tests

After seeing the comments below, I ran some tests on this and the other answers (using Python 3.2).

Test functions:

@time_me
def test_dict(string, iterations):
    """dict((c, string.count(c)) for c in string)"""
    for i in range(iterations):
        dict((c, string.count(c)) for c in string)

@time_me
def test_set(string, iterations):
    """dict((c, string.count(c)) for c in set(string))"""
    for i in range(iterations):
        dict((c, string.count(c)) for c in set(string))

@time_me
def test_counter(string, iterations):
    """Counter(string)"""
    for i in range(iterations):
        Counter(string)

@time_me
def test_for(string, iterations, d):
    """for loop from cha0site"""
    for i in range(iterations):
        for c in string:
            if c in d:
                d[c] += 1

@time_me
def test_default_dict(string, iterations):
    """defaultdict from joaquin"""
    for i in range(iterations):
        mydict = defaultdict(int)
        for mychar in string:
            mydict[mychar] += 1
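
The time_me decorator is not shown in this answer. The following is a guess at what it might look like, based only on the output format in the results (elapsed milliseconds followed by the function's docstring). Treat it as a hypothetical stand-in, not the original code; it uses time.perf_counter, which requires Python 3.3 or later.

```python
import time
from functools import wraps

def time_me(func):
    """Hypothetical reconstruction: print elapsed ms and the docstring."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        # Format chosen to resemble lines like "389.091 ms -  <docstring>".
        print('{:.3f} ms -  {}'.format(elapsed_ms, func.__doc__))
        return result
    return wrapper

@time_me
def timed_sum(n):
    """sum(range(n))"""
    return sum(range(n))

timed_sum(10)  # prints the elapsed time, then returns 45
```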

Test run:

import string  # needed for string.ascii_letters

d_ini = dict((c, 0) for c in string.ascii_letters)
words = ['hand', 'marvelous', 'supercalifragilisticexpialidocious']

for word in words:
    print('-- {} --'.format(word))
    test_dict(word, 100000)
    test_set(word, 100000)
    test_counter(word, 100000)
    test_for(word, 100000, d_ini)
    test_default_dict(word, 100000)
    print()

print('-- {} --'.format('Pride and Prejudcie - Chapter 3 '))

# ch holds the chapter text; how it is loaded is not shown in this answer
test_dict(ch, 1000)
test_set(ch, 1000)
test_counter(ch, 1000)
test_for(ch, 1000, d_ini)
test_default_dict(ch, 1000)

Test results:

-- hand --
389.091 ms -  dict((c, string.count(c)) for c in string)
438.000 ms -  dict((c, string.count(c)) for c in set(string))
867.069 ms -  Counter(string)
100.204 ms -  for loop from cha0site
241.070 ms -  defaultdict from joaquin

-- marvelous --
654.826 ms -  dict((c, string.count(c)) for c in string)
729.153 ms -  dict((c, string.count(c)) for c in set(string))
1253.767 ms -  Counter(string)
201.406 ms -  for loop from cha0site
460.014 ms -  defaultdict from joaquin

-- supercalifragilisticexpialidocious --
1900.594 ms -  dict((c, string.count(c)) for c in string)
1104.942 ms -  dict((c, string.count(c)) for c in set(string))
2513.745 ms -  Counter(string)
703.506 ms -  for loop from cha0site
935.503 ms -  defaultdict from joaquin

# !!!: Do not compare this last result with the others because it is timed
#      with 1000 iterations instead of 100000
-- Pride and Prejudcie - Chapter 3  --
155315.108 ms -  dict((c, string.count(c)) for c in string)
982.582 ms -  dict((c, string.count(c)) for c in set(string))
4371.579 ms -  Counter(string)
1609.623 ms -  for loop from cha0site
1300.643 ms -  defaultdict from joaquin
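
To repeat a comparison like this on a current interpreter, the standard timeit module is one option. This is a rough sketch rather than the harness used above: the function names and the iteration count are assumptions, and the numbers will differ by machine and Python version.

```python
import timeit
from collections import Counter, defaultdict

def count_with_set(text):
    # One str.count() call per distinct character.
    return dict((c, text.count(c)) for c in set(text))

def count_with_counter(text):
    return Counter(text)

def count_with_defaultdict(text):
    counts = defaultdict(int)
    for ch in text:
        counts[ch] += 1
    return counts

word = 'supercalifragilisticexpialidocious'
for fn in (count_with_set, count_with_counter, count_with_defaultdict):
    # timeit runs immediately, so the lambda sees the current fn.
    seconds = timeit.timeit(lambda: fn(word), number=10000)
    print('{:9.3f} ms -  {}'.format(seconds * 1000.0, fn.__name__))
```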
