Code to count the syllables of words in a file

2 votes

1 answer

3625 views

Asked 2025-04-16 14:53

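I have some code that counts the syllables of words in the CMU Pronouncing Dictionary. It computes the syllable count for every word in the dictionary. Now I need to replace the CMU dictionary with my own input file, find the number of syllables of each word in that file, and print the results. Simply opening the input file in read mode doesn't solve it, though, because dict() can't be used as an attribute of a file object.

Here is the code: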

  
from curses.ascii import isdigit 
from nltk.corpus import cmudict 

d = cmudict.dict() # get the CMU Pronouncing Dict

def nsyl(word): 
    """return the max syllable count in the case of multiple pronunciations"""
    return max([len([y for y in x if isdigit(y[-1])]) for x in d[word.lower()]])


w_words = dict([(w, nsyl(w)) for w in d.keys() if w[0] == 'a'or'z'])
worth_abbreviating = [(k,v) for (k,v) in w_words.iteritems() if v > 3]
print worth_abbreviating

Can anyone help me with this?

file handling, syllable counting, CMU Pronouncing Dictionary, custom input

1 Answer

I'm not sure this solves every problem, but:

w_words = dict([(w, nsyl(w)) for w in d.keys() if w[0] == 'a'or'z'])

should be changed to

w_words = dict([(w, nsyl(w)) for w in d.keys() if w[0] == 'a' or w[0] == 'z'])

because

if w[0] == 'a'or'z' is parsed as if (w[0] == 'a') or ('z'). The non-empty string 'z' is truthy, so the condition always holds and no word is filtered out.

For example,

In [36]: 'x' == 'a'or'z'
Out[36]: 'z'

In [37]: 'x' == 'a' or 'x'=='z'
Out[37]: False
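
As for the file part of the question, here is a rough sketch of one way to do it, written for Python 3. The idea is to keep using the CMU dictionary for lookups and take the words from your own file instead of from d.keys(). The names count_syllables, syllables_in_text and report, and the file name input.txt, are made up for illustration and do not come from the original code; returning None for words the dictionary does not know is also just an assumption about what you want.

```python
import re


def count_syllables(pronunciations):
    """Return the max syllable count over a word's pronunciations.

    In CMU-style entries, vowel phonemes end in a stress digit (0, 1 or 2),
    so counting those phonemes counts syllables.
    """
    return max(sum(1 for ph in pron if ph[-1].isdigit())
               for pron in pronunciations)


def syllables_in_text(text, pron_dict):
    """Map each distinct word in text to its syllable count.

    Words missing from pron_dict map to None instead of raising KeyError.
    """
    counts = {}
    for word in re.findall(r"[a-z']+", text.lower()):
        prons = pron_dict.get(word)
        counts[word] = count_syllables(prons) if prons else None
    return counts


def report(path):
    """Print the syllable count of every word in the file at path."""
    # Imported here so the two helpers above can be used without NLTK.
    from nltk.corpus import cmudict

    with open(path, encoding="utf-8") as f:
        text = f.read()
    for word, n in syllables_in_text(text, cmudict.dict()).items():
        print(word, n)
```

Calling report("input.txt") would then print one word and its count per line, provided the cmudict corpus has been downloaded (for example with nltk.download('cmudict')). The lookup is a plain dict access, which is why opening the file by itself was never going to be enough: the file only supplies the words, while the pronunciations still come from cmudict.dict().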

Answered 2025-04-16 by Python大师
