Python keyword-reading program: trouble removing punctuation

by_word = {}
with open('novel.txt') as f:
    for line in f:
        for word in line.strip().split():
            if word[0].isupper():
                if word in by_word:
                    by_word[word] += 1
                else:
                    by_word[word] = 1

by_count = []
for word in by_word:
    by_count.append((by_word[word], word))

by_count.sort()
by_count.reverse()

for count, word in by_count[:100]:
    print(count, word)

2 answers

User

Answer #1 · edited 2024-06-09 14:23:36

Hopefully the following works for you as expected:

import string
exclude = set(string.punctuation)

by_word = {}
with open ('novel.txt') as f:
  for line in f:
    for word in line.strip().split():
      if word[0].isupper():
        word = ''.join(char for char in word if char not in exclude)
        if word in by_word:
          by_word[word] += 1
        else:
          by_word[word] = 1

by_count = []
for word in by_word:
  by_count.append((by_word[word], word))

by_count.sort()
by_count.reverse()

for count, word in by_count[:100]:
  print(count, word)

It removes all of the following characters from word:

!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
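One thing to watch in the code above: the isupper() test runs before the punctuation is stripped, so a token that starts with a quote or bracket, such as "Hello, is skipped. Below is a minimal sketch that strips first and tests afterwards, using str.translate and collections.Counter. The helper name count_capitalized and the sample sentence are hypothetical, chosen only for illustration.

```python
import string
from collections import Counter

# Translation table that deletes every character in string.punctuation.
_STRIP_PUNCT = str.maketrans('', '', string.punctuation)


def count_capitalized(lines):
    """Count capitalized words in an iterable of text lines (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        for token in line.split():
            word = token.translate(_STRIP_PUNCT)
            # Guard against tokens that consisted of punctuation only, e.g. "--".
            if word and word[0].isupper():
                counts[word] += 1
    return counts


# Illustrative input; for the real file, pass the open file object instead.
sample = ['"Hello," said Alice. Hello again, Bob!']
for word, count in count_capitalized(sample).most_common(100):
    print(count, word)
```

Counter.most_common(100) stands in for the manual sort, reverse and slice. To process the novel, something like `with open('novel.txt') as f: count_capitalized(f)` should work, since a file object iterates over its lines.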

User

Answer #2 · edited 2024-06-09 14:23:36

Your code is fine. To drop the punctuation, split with a regular expression instead:

for word in line.strip().split():

can be changed to

for word in re.split('[,.;]',line.strip()):

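Here the character class inside [] in the first argument lists the punctuation characters to split on. This uses the re module, so import re is needed: https://docs.python.org/2/library/re.html#re.split

Note that this pattern splits only on those three characters, not on whitespace, and that re.split returns an empty string when a line is empty or ends with one of them, so word[0] needs a guard against empty pieces.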
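An alternative to splitting on punctuation is to match the words themselves with re.findall, which never yields empty pieces. The sketch below assumes a "word" is a run of ASCII letters, optionally joined by apostrophes. That definition, the name capitalized_words and the sample lines are illustrative assumptions, not part of the original answer.

```python
import re
from collections import Counter

# Assumed definition of a word: ASCII letters, optionally joined by
# apostrophes (so "It's" stays whole). Adjust for digits or other alphabets.
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def capitalized_words(line):
    """Return the words in line that start with an uppercase letter."""
    return [w for w in WORD_RE.findall(line) if w[0].isupper()]


counts = Counter()
# Illustrative lines; an empty line simply contributes no words.
for line in ["It's late; Alice left.", "", "Bob, Alice and Carol stayed."]:
    counts.update(capitalized_words(line))

for word, count in counts.most_common(100):
    print(count, word)
```

Because every match contains at least one letter, indexing w[0] is safe here without an extra emptiness check.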
