Counting repeated words in Python

Posted 2024-05-19 03:06:40


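I have a problem where I have to count the repeated words in a sentence in Python (v3.4.1) and print each one in a sentence of its own. I used Counter, but I don't know how to get the output in the order shown below. The input is: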

mysentence = As far as the laws of mathematics refer to reality they are not certain as far as they are certain they do not refer to reality

I split it into a list and sorted it.

The result should be:

"As" is repeated 1 time.
"are" is repeated 2 times.
"as" is repeated 3 times.
"certain" is repeated 2 times.
"do" is repeated 1 time.
"far" is repeated 2 times.
"laws" is repeated 1 time.
"mathematics" is repeated 1 time.
"not" is repeated 2 times.
"of" is repeated 1 time.
"reality" is repeated 2 times.
"refer" is repeated 2 times.
"the" is repeated 1 time.
"they" is repeated 3 times.
"to" is repeated 2 times.

This is what I have so far:

x=input ('Enter your sentence :')
y=x.split()
y.sort()
for y in sorted(y):
    print (y)

3 Answers

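I can see where you are going with sort: in a sorted list you can reliably tell when you have hit a new word, and keep a running count for each unique word. However, what you really want is a hash (a dictionary) to keep track of the counts, because dictionary keys are unique. For example: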

words = sentence.split()
counts = {}
for word in words:
    if word not in counts:
        counts[word] = 0
    counts[word] += 1

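That gives you a dictionary where the key is the word and the value is the number of times it appears. You can simplify this with collections.defaultdict(int), which lets you increment the value without first checking whether the key exists: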

import collections

counts = collections.defaultdict(int)  # missing keys start at 0
for word in words:
    counts[word] += 1

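But there is something even better than that: collections.Counter, which takes your list of words and turns it into a dictionary (actually a subclass of dict) containing the counts.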

counts = collections.Counter(words)
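
To make the Counter behaviour concrete, here is a minimal sketch. The shortened sentence is an assumption made for illustration only; it is not the asker's full input.

```python
from collections import Counter

# Shortened example sentence (an assumption for illustration).
words = "as far as they are certain as far".split()
counts = Counter(words)

print(isinstance(counts, dict))  # Counter is a subclass of dict
print(counts["as"])              # count for a word that was seen
print(counts["missing"])         # unseen words report 0 instead of raising KeyError
```

Looking up an unseen word returns 0 without adding that key to the Counter, so the counts are not polluted by lookups.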

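From there, you want the words with their counts in sorted order so you can print them. items() gives you the (word, count) pairs as tuples (a view object in Python 3, a list in Python 2), and sorted() sorts them by default by the first item of each tuple (the word, in this case), which is exactly what you want.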

import collections
sentence = """As far as the laws of mathematics refer to reality they are not certain as far as they are certain they do not refer to reality"""
words = sentence.split()
word_counts = collections.Counter(words)
for word, count in sorted(word_counts.items()):
    print('"%s" is repeated %d time%s.' % (word, count, "s" if count > 1 else ""))

Output:

"As" is repeated 1 time.
"are" is repeated 2 times.
"as" is repeated 3 times.
"certain" is repeated 2 times.
"do" is repeated 1 time.
"far" is repeated 2 times.
"laws" is repeated 1 time.
"mathematics" is repeated 1 time.
"not" is repeated 2 times.
"of" is repeated 1 time.
"reality" is repeated 2 times.
"refer" is repeated 2 times.
"the" is repeated 1 time.
"they" is repeated 3 times.
"to" is repeated 2 times.

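A note on the ordering in the output above: tuples compare element by element, so sorted() orders the (word, count) pairs by word, and strings compare by code point, which is why the capitalised "As" comes before every lowercase word. The sketch below illustrates this on a shortened sentence (an assumption for illustration) and also shows one possible way to order by count instead, using a key function; that second ordering is not something the question asks for.

```python
from collections import Counter

# Shortened example sentence (an assumption for illustration).
sentence = "As far as they are certain as far as they do"
word_counts = Counter(sentence.split())

# Default ordering: by word, uppercase letters before lowercase ones.
by_word = sorted(word_counts.items())

# Alternative ordering: highest count first, ties broken alphabetically.
by_count = sorted(word_counts.items(), key=lambda item: (-item[1], item[0]))

print(by_word)
print(by_count)
```
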
To print the repeated words of a string in sorted order:

from itertools import groupby 

mysentence = ("As far as the laws of mathematics refer to reality "
              "they are not certain as far as they are certain "
              "they do not refer to reality")
words = mysentence.split() # get a list of whitespace-separated words
for word, duplicates in groupby(sorted(words)): # sort and group duplicates
    count = len(list(duplicates)) # count how many times the word occurs
    print('"{word}" is repeated {count} time{s}'.format(
            word=word, count=count,  s='s'*(count > 1)))

Output

"As" is repeated 1 time
"are" is repeated 2 times
"as" is repeated 3 times
"certain" is repeated 2 times
"do" is repeated 1 time
"far" is repeated 2 times
"laws" is repeated 1 time
"mathematics" is repeated 1 time
"not" is repeated 2 times
"of" is repeated 1 time
"reality" is repeated 2 times
"refer" is repeated 2 times
"the" is repeated 1 time
"they" is repeated 3 times
"to" is repeated 2 times

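The sorted() call in the groupby approach is essential: itertools.groupby only groups consecutive equal items, so identical words must be adjacent before they are grouped. A minimal sketch on a tiny word list (an assumption for illustration) shows the difference:

```python
from itertools import groupby

# Tiny example input (an assumption for illustration).
words = "as far as far".split()

# Without sorting, equal words are not adjacent, so each one forms its own group.
unsorted_groups = [(word, len(list(group))) for word, group in groupby(words)]

# After sorting, equal words are adjacent and are grouped together.
sorted_groups = [(word, len(list(group))) for word, group in groupby(sorted(words))]

print(unsorted_groups)
print(sorted_groups)
```
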
Hey, I tried this on Python 2.7 (Mac) because that is the version I have, so try to grasp the logic:

from collections import Counter

mysentence = """As far as the laws of mathematics refer to reality they are not certain as far as they are certain they do not refer to reality"""

mysentence = dict(Counter(mysentence.split()))
for i in sorted(mysentence.keys()):
    print ('"'+i+'" is repeated '+str(mysentence[i])+' time.')

I hope this is what you are looking for. If not, let me know; I'm happy to learn something new.

"As" is repeated 1 time.
"are" is repeated 2 time.
"as" is repeated 3 time.
"certain" is repeated 2 time.
"do" is repeated 1 time.
"far" is repeated 2 time.
"laws" is repeated 1 time.
"mathematics" is repeated 1 time.
"not" is repeated 2 time.
"of" is repeated 1 time.
"reality" is repeated 2 time.
"refer" is repeated 2 time.
"the" is repeated 1 time.
"they" is repeated 3 time.
"to" is repeated 2 time.

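Note that the output above always prints "time", whereas the question expects "times" for counts greater than 1. One possible adjustment is sketched below; the helper name describe and the shortened sentence are hypothetical, introduced only for illustration, and the code should run on both Python 2.7 and Python 3.

```python
from collections import Counter

def describe(word, count):
    """Hypothetical helper: format one output line with the right plural."""
    suffix = "s" if count > 1 else ""
    return '"{0}" is repeated {1} time{2}.'.format(word, count, suffix)

# Shortened example sentence (an assumption for illustration).
mysentence = "they are not certain as far as they are certain they do"
counts = Counter(mysentence.split())

# Iterating over a Counter yields its keys, so sorted(counts) is the sorted words.
lines = [describe(word, counts[word]) for word in sorted(counts)]
for line in lines:
    print(line)
```
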