Appending to a list that tallies word lengths
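I am reading words from a text file, stripping the newline from each word, and putting the words into a new list.
Now I need to go through the words one by one, find the length of each, and add 1 to the count for that length in a tally. In other words, I start with an empty tally: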
length_of_words = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
If the word list I process contains five 7-letter words and three 2-letter words, the final tally would be:
length_of_words = [0,3,0,0,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
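So this boils down to:
- Compute the length of a word, say n
- Add 1 to the count for that length in the tally, i.e. add 1 to length_of_words[n-1] (because position 0 holds the 1-letter words)
Where I am stuck is how to add 1 to the value at a given position in the list, rather than appending a 1 to the end of the list.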
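For illustration, the difference between the two operations can be sketched as follows. This uses Python 3 syntax, and the names `tally`, `appended`, and `n` are made up for the example rather than taken from the code in this thread.

```python
# Python 3 sketch; `tally`, `appended`, and `n` are illustrative names.
tally = [0] * 5          # five counters, all starting at zero

n = 3                    # pretend we just measured a 3-letter word
tally[n - 1] += 1        # increment the existing slot for length 3
print(tally)             # [0, 0, 1, 0, 0]

appended = [0] * 5
appended.append(1)       # by contrast, append grows the list by one element
print(appended)          # [0, 0, 0, 0, 0, 1]
```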
My current code looks like this:
lines = open ('E:\Python\Assessment\dracula.txt', 'r'). readlines ()
stripped_list = [item.strip() for item in lines]
tally = [] #empty set of lengths
for lengths in range(1,20):
    tally.append(0)
print tally #original tally
for i in stripped_list:
    length_word = int(len(i))
    tally[length_word] = tally[length_word] + 1
print tally
2 Answers
2
The collections.Counter class is very handy for this:
>>> from collections import Counter
>>> words = 'the quick brown fox jumped over the lazy dog'.split()
>>> Counter(map(len, words))
Counter({3: 4, 4: 2, 5: 2, 6: 1})
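If the fixed-length list from the question is still wanted, the Counter can be converted into one. The following is only a sketch in Python 3 syntax: `counter_to_tally` and its `max_len` parameter are hypothetical names introduced here, and lengths outside the list are silently ignored as a simplifying assumption.

```python
from collections import Counter

def counter_to_tally(counts, max_len=26):
    """Hypothetical helper: index n-1 of the result holds the count for length n."""
    tally = [0] * max_len
    for length, count in counts.items():
        # Assumption: lengths that do not fit in the list are dropped.
        if 1 <= length <= max_len:
            tally[length - 1] = count
    return tally

words = 'the quick brown fox jumped over the lazy dog'.split()
counts = Counter(map(len, words))
tally = counter_to_tally(counts, max_len=8)
print(tally)  # [0, 0, 4, 2, 2, 1, 0, 0]
```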
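The code you posted in the question runs fine as it is, so I am not sure where you are stuck.
By the way, here are a few small suggestions to make the code more Pythonic: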
stripped_list = 'the quick brown fox jumped over the lazy dog'.split()
tally = [0] * 20
print tally #original tally
for i in stripped_list:
    length_word = len(i)
    tally[length_word] += 1
print tally
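One caveat with a fixed-size list of 20 slots is that a word of 20 or more characters would raise an IndexError. A possible way around that, sketched here in Python 3 syntax, is to grow the list on demand. The function name `tally_lengths` is made up for this example, and it keeps the same indexing as the code above (index n counts the n-letter words, so slot 0 counts empty strings).

```python
def tally_lengths(words):
    """Hypothetical helper: index n holds the number of words of length n."""
    tally = []
    for word in words:
        n = len(word)
        if n >= len(tally):
            # Pad with zeros so that index n exists before incrementing it.
            tally.extend([0] * (n - len(tally) + 1))
        tally[n] += 1
    return tally

words = 'the quick brown fox jumped over the lazy dog'.split()
result = tally_lengths(words)
print(result)  # [0, 0, 0, 4, 2, 2, 1]
```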
0
I think the bug in your code is on the tally[length_word] line: you forgot the - 1.
I also made a few changes to your code to make it look more Pythonic.
#lines = open ('E:\Python\Assessment\dracula.txt', 'r'). readlines ()
#stripped_list = [item.strip() for item in lines]
with open('/home/facundo/tmp/words.txt') as i:
    stripped_list = [x.strip() for x in i.readlines()]
#tally = [] #empty set of lengths
#for lengths in range(1,20):
#    tally.append(0)
tally = [0] * 20
print tally #original tally
for i in stripped_list:
    #length_word = int(len(i))
    word_length = len(i)
    #tally[length_word] = tally[length_word] + 1
    if word_length > 0:
        tally[word_length - 1] += 1
print tally
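The same idea can be wrapped in a small function. This is a sketch in Python 3 syntax, not a drop-in replacement: `tally_lines` and its `size` parameter are names invented for the example, and skipping words that are too long for the list is an assumption rather than something the question asked for.

```python
def tally_lines(lines, size=20):
    """Hypothetical helper: index n-1 holds the count of n-letter words."""
    tally = [0] * size
    for line in lines:
        n = len(line.strip())
        # Assumption: skip blank lines and words too long for the list.
        if 1 <= n <= size:
            tally[n - 1] += 1
    return tally

sample = ["the\n", "quick\n", "\n", "fox\n"]
result = tally_lines(sample, size=6)
print(result)  # [0, 0, 2, 0, 1, 0]
```

Because iterating over an open file yields its lines, a file object opened in a with block can be passed to the function in place of the sample list.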