<p>Test.txt contains the following sentence (How much wood would a woodchuck chuck if a woodchuck could chuck wood)</p>
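<p>The program should read every word in the given text file (until EOF)
and print a count for each word. Words should be treated as
case-insensitive (converted to upper case), punctuation should be
removed, and the output should be sorted by
frequency.</p>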
<p>However, I've run into a simple problem: my code is counting lines, not words. Can someone help me out?</p>
<blockquote>
<p>Make a translation table for getting rid of non-word characters</p>
</blockquote>
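<pre><code># Characters to strip: punctuation and digits
dropChars = "!@#$%^&*()_+-={}[]|\\:;\"'<>,.?/1234567890"
# Map each character to the empty string so translate() deletes it
dropDict = {c: '' for c in dropChars}
dropTable = str.maketrans(dropDict)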
</code></pre>
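<p>To see what a table like this does to a line of text, here is a small sketch. It assumes the standard library's <code>string.punctuation</code> and <code>string.digits</code> are an acceptable stand-in for the hand-written character list; the names <code>table_from_dict</code> and <code>table_from_args</code> are only illustrative.</p>

```python
import string

# Assumption: the stdlib constants replace the hand-written character list.
drop_chars = string.punctuation + string.digits

# Dict form, as in the snippet above: each character maps to the empty string.
table_from_dict = str.maketrans({c: "" for c in drop_chars})

# Three-argument form: the third argument lists the characters to delete.
table_from_args = str.maketrans("", "", drop_chars)

sample = "How much wood, would a woodchuck chuck?"
cleaned = sample.upper().translate(table_from_dict)
print(cleaned.split())
```

<p>Both forms should remove the same characters. Note that a space must not appear in the drop list, otherwise <code>split()</code> has nothing left to split on.</p>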
<blockquote>
<p>Read a file and build the table.</p>
</blockquote>
<pre><code>f = open("Test.txt")
testList = list()
lineNum = 0
table = {}  # dictionary: words -> set of line numbers
for line in f:
    testList.append(line)
for line in testList:
    lineNum += 1
    words = line.upper().translate(dropTable).split()
    for word in words:
        if word in table:
            table[word].add(lineNum)
        else:
            table[word] = {lineNum}
f.close()
</code></pre>
<blockquote>
<p>Print the table</p>
</blockquote>
<pre><code>for word in sorted(table.keys()):
    print(word, end=": ")
    for lineNum in sorted(table[word]):
        print(lineNum, end=" ")
    print()
</code></pre>
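<p>For comparison, one possible way to store a count per word instead of a set of line numbers is sketched below. It is not the only approach: it assumes <code>collections.Counter</code> is acceptable, and <code>count_words</code> and <code>print_by_frequency</code> are hypothetical helper names introduced for illustration.</p>

```python
import string
from collections import Counter

# Assumption: stdlib punctuation and digits stand in for the hand-written list.
DROP_TABLE = str.maketrans("", "", string.punctuation + string.digits)


def count_words(lines):
    """Count upper-cased words across an iterable of text lines."""
    counts = Counter()
    for line in lines:
        # Each occurrence adds 1 to that word's count.
        counts.update(line.upper().translate(DROP_TABLE).split())
    return counts


def print_by_frequency(counts):
    """Print 'WORD: n' pairs, most frequent first."""
    for word, n in counts.most_common():
        print(f"{word}: {n}")


# In-memory sample so the sketch runs without Test.txt.
lines = [
    "How much wood would a woodchuck chuck",
    "if a woodchuck could chuck wood?",
]
print_by_frequency(count_words(lines))
```

<p>A file object can be passed straight to <code>count_words</code>, for example inside <code>with open("Test.txt") as f:</code>, because iterating over a file yields its lines.</p>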