Finding anagrams in Python from a given file

Asked 2025-04-17 19:10

I have tried everything I can think of to solve this, but I still can't figure it out. I don't even know where to start. Here are the requirements...

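Your program should ask the user for the name of a file that contains words, one word per line.
• For each word, find all of its anagrams (some words may have more than one).
• Output: report how many words have 0, 1, 2, etc. anagrams. Output the list of words with the most anagrams (if several words tie, output all of them).
• You should break the program into functions in a sensible way.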

Please keep in mind that I have been programming for less than a month, so keep things as simple as possible. Thanks in advance.

Tags: user-input, file-handling, programming, beginner, statistics, text-analysis, anagram, functions

1 Answer

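I guess this is your homework. As you know, an anagram is a rearrangement of the letters of a word. Take it slowly: first learn how to compute the anagrams of a single word, then how to handle many words. The interactive session below shows how to compute the anagrams of one word. You can build on it from there.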

>>> # Learn how to calculate anagrams of a word
>>> 
>>> import itertools
>>> 
>>> word = 'fun'
>>> 
>>> # First attempt: anagrams are just permutations of all the characters in a word
>>> for permutation in itertools.permutations(word):
...     print(permutation)
... 
('f', 'u', 'n')
('f', 'n', 'u')
('u', 'f', 'n')
('u', 'n', 'f')
('n', 'f', 'u')
('n', 'u', 'f')
>>> 
>>> # Now, refine the above block to print actual strings instead of tuples
>>> for permutation in itertools.permutations(word):
...     print(''.join(permutation))
... 
fun
fnu
ufn
unf
nfu
nuf
>>> # Note that words with repeated characters, such as 'all',
>>> # have fewer distinct anagrams:
>>> word = 'all'
>>> for permutation in itertools.permutations(word):
...     print(''.join(permutation))
... 
all
all
lal
lla
lal
lla
>>> # Note that 'all', 'lal', and 'lla' each appear twice. We need to
>>> # eliminate the duplicates. One way is to use a set:
>>> word = 'all'
>>> anagrams = set()
>>> for permutation in itertools.permutations(word):
...     anagrams.add(''.join(permutation))
... 
>>> anagrams
{'lal', 'all', 'lla'}
>>> for anagram in anagrams:
...     print(anagram)
... 
lal
all
lla
>>> # How many anagrams does the word 'all' have?
>>> # Just count using the len() function:
>>> len(anagrams)
3
>>>

I pasted the session above here for your convenience.
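The steps in the session can also be folded into one small function. This is only a sketch, assuming Python 3; the name `distinct_permutations` is an illustrative choice and not part of the original session. Note that it returns every rearrangement of the letters, whether or not the result is a real word.

```python
import itertools


def distinct_permutations(word):
    """Return the set of distinct letter arrangements of word."""
    # permutations() yields tuples of characters; join each back into a string.
    # Collecting into a set drops the duplicates caused by repeated letters.
    return {''.join(p) for p in itertools.permutations(word)}


print(sorted(distinct_permutations('all')))  # ['all', 'lal', 'lla']
print(len(distinct_permutations('fun')))     # 6
```

The number of permutations grows factorially with word length (a 10-letter word has 3,628,800), so this approach is only practical for short words.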

Update

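Now, with Aaron's clarification, the most basic question is: how do you tell whether two words are anagrams of each other? The answer is: "when they contain the same letters, each the same number of times." To me, the simplest way to check this is to sort the letters of each word and compare the results.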

def normalize(word):
    word = word.strip().lower()  # sanitize: drop surrounding whitespace, ignore case
    word = ''.join(sorted(word))  # sort the letters to get a canonical form
    return word

# normalize('top') ==> 'opt'
# Are 'top' and 'pot' anagrams? They are if their sorted letters are the same:
if normalize('top') == normalize('pot'):
    print('they are the same')
    # Do something
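The idea that two anagrams contain the same letters the same number of times can also be checked directly, by counting letters instead of sorting them. The following is a minimal sketch, assuming Python 3. `are_anagrams` is a hypothetical helper name, and `collections.Counter` is an alternative swapped in here for illustration; it is not part of the original answer.

```python
from collections import Counter


def are_anagrams(first, second):
    """Return True if both words use the same letters the same number of times."""
    # Same clean-up as normalize(): ignore surrounding whitespace and letter case.
    first = first.strip().lower()
    second = second.strip().lower()
    # Counter maps each letter to how many times it occurs.
    return Counter(first) == Counter(second)


print(are_anagrams('top', 'pot'))   # True
print(are_anagrams('top', 'tops'))  # False
```

Sorting has one practical advantage over counting: the sorted string can be used as a dictionary key, which is handy when grouping many words.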

Now that you know how to compare two words, let's work on a list of words:

>>> import collections
>>> anagrams = collections.defaultdict(list)
>>> words = ['top', 'fun', 'dog', 'opt', 'god', 'pot']
>>> for word in words:
...     anagrams[normalize(word)].append(word)
... 
>>> anagrams
defaultdict(<class 'list'>, {'opt': ['top', 'opt', 'pot'], 'fnu': ['fun'], 'dgo': ['dog', 'god']})
>>> for k, v in anagrams.items():
...     print(k, '-', v)
... 
opt - ['top', 'opt', 'pot']
fnu - ['fun']
dgo - ['dog', 'god']

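In the session above, we use anagrams (a defaultdict, which works like an ordinary dict but supplies a default value for missing keys) to store lists of words. The keys are the sorted letters, so anagrams['opt'] ==> ['top', 'opt', 'pot']. A word in a list of length n has n - 1 anagrams, so from this you can tell which words have the most anagrams. The rest should be easy.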
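One possible way to assemble these pieces into the program the question asks for is sketched below. It assumes Python 3 and that the file lists each word only once. The function names `read_words`, `group_anagrams`, and `summarize` are hypothetical, chosen for illustration; `normalize` is repeated so the block runs on its own.

```python
import collections


def normalize(word):
    """Return the word's letters, lowercased and sorted, as the grouping key."""
    return ''.join(sorted(word.strip().lower()))


def read_words(filename):
    """Read one word per line, skipping blank lines."""
    with open(filename) as word_file:
        return [line.strip() for line in word_file if line.strip()]


def group_anagrams(words):
    """Map each sorted-letter key to the list of words that share it."""
    groups = collections.defaultdict(list)
    for word in words:
        groups[normalize(word)].append(word)
    return groups


def summarize(groups):
    """Return (histogram of anagram counts, words with the most anagrams)."""
    histogram = collections.Counter()
    most = 0
    best_words = []
    for group in groups.values():
        # Assumes no duplicate words: each word in a group of size n
        # has n - 1 anagrams among the other words.
        count = len(group) - 1
        histogram[count] += len(group)
        if count > most:
            most, best_words = count, list(group)
        elif count == most:
            best_words.extend(group)
    return histogram, best_words


# Demonstration with the list from the session above instead of a file.
words = ['top', 'fun', 'dog', 'opt', 'god', 'pot']
histogram, best_words = summarize(group_anagrams(words))
for count in sorted(histogram):
    print(count, 'anagram(s):', histogram[count], 'word(s)')
print('Most anagrams:', best_words)
```

To work from a file as the assignment requires, the `words` list could instead come from `read_words(input('File name: '))`.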

Answered 2025-04-17 by Python大师
