Python - How to print the count of numbers, periods, and commas in a file

Asked 2025-04-18 05:17

def showCounts(fileName):
    lineCount = 0
    wordCount = 0
    numCount = 0
    comCount = 0
    dotCount = 0

    with open(fileName, 'r') as f:
        for line in f:
            words = line.split()
            lineCount += 1
            wordCount += len(words)

            for word in words:
                # text = word.translate(string.punctuation)
                exclude = set(string.punctuation)
                text = ""
                text = ''.join(ch for ch in text if ch not in exclude)
                try:
                    if int(text) >= 0 or int(text) < 0:
                        numCount += 1
                    # elif text == ",":
                    #     comCount += 1
                    # elif text == ".":
                    #     dotCount += 1
                except ValueError:
                    pass

    print("Line count: " + str(lineCount))
    print("Word count: " + str(wordCount))
    print("Number count: " + str(numCount))
    print("Comma count: " + str(comCount))
    print("Dot  count: " + str(dotCount) + "\n")

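Basically, it displays the line count and the word count, but I can't get it to display the number of numbers, commas, and periods. It reads a file the user enters and then shows the line and word counts, but for some reason the counts for numbers, commas, and periods are all 0. I commented out the part that was giving me trouble. If I remove the commas, I get an error. Thanks, everyone.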

3 Answers

0

For the punctuation, why not simply do this:

def showCounts(fileName):
    ...
    ...
    with open(fileName, 'r') as fl:
        f = fl.read()

    comCount = f.count(',')
    dotCount = f.count('.')
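
For reference, here is a minimal self-contained sketch of the same idea, assuming the file is small enough to read into memory in one go. The function name count_punctuation and the returned dict are my own illustration, not part of the original code.

```python
def count_punctuation(file_name):
    """Hypothetical helper: count commas and dots in a whole file."""
    with open(file_name, 'r') as fl:
        text = fl.read()  # read everything at once; fine for small files
    # str.count returns the number of non-overlapping occurrences
    return {'commas': text.count(','), 'dots': text.count('.')}
```

Calling count_punctuation('test.txt') would return something like {'commas': 5, 'dots': 7}, depending on the file contents.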

0

You can use the Counter class for this:

from collections import Counter

with open(fileName, 'r') as f:
    data    = f.read().strip()
    lines   = len(data.split('\n'))
    words   = len(data.split())
    counts  = Counter(data)
    numbers = sum(v for (k,v) in counts.items() if k.isdigit())

print("Line count: {}".format(lines))
print("Word count: {}".format(words))
print("Number count: {}".format(numbers))
print("Comma count: {}".format(counts[',']))
print("Dot count: {}".format(counts['.']))
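
Note that the snippet above counts digit characters, so 125 contributes 3 to the total. If whole numbers are what you want (which the int() call in the question seems to suggest), one possible sketch is to strip punctuation from each word and test what is left. The helper name count_numbers is hypothetical, and it only recognizes plain unsigned integers.

```python
import string

def count_numbers(text):
    """Hypothetical helper: count words that are whole numbers once
    surrounding punctuation is stripped (e.g. "125," -> "125")."""
    count = 0
    for word in text.split():
        token = word.strip(string.punctuation)  # removes leading/trailing punctuation only
        if token.isdigit():
            count += 1
    return count

print(count_numbers("I am 125 years old, and I was 124 years old 1 year ago."))  # 3
```

Decimals such as 1.5 are not counted by this sketch because the inner dot is kept; handling those would need a different check.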

0

This code walks through every character on each line and adds 1 to the matching counter:

numCount = 0
dotCount = 0
commaCount = 0
lineCount = 0
wordCount = 0

fileName = 'test.txt'

with open(fileName, 'r') as f:
    for line in f:
        wordCount += len(line.split())
        lineCount += 1
        for char in line:
            if char.isdigit():
                numCount += 1
            elif char == '.':
                dotCount += 1
            elif char == ',':
                commaCount += 1

print("Number count: " + str(numCount))
print("Comma count: " + str(commaCount))
print("Dot  count: " + str(dotCount))
print("Line count: " + str(lineCount))
print("Word count: " + str(wordCount))

Let's test it:

test.txt:

Hello, my name is B.o.b. I like biking, swimming, and running.

I am 125 years old, and  I was 124 years old 1 year ago.

Regards,
B.o.b 

Output:

bash-3.2$ python count.py
Number count: 7
Comma count: 5
Dot  count: 7
Line count: 6
Word count: 27
bash-3.2$ 

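Everything here should be clear except for the lineCount variable. Why is it 6? Because of the newline characters: the two blank lines between paragraphs are counted as lines too. My editor (nano) also adds a newline at the end of any file by default. So you can picture the text file like this: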

>>> x = open('test.txt').read()
>>> x
'Hello, my name is B.o.b. I like biking, swimming, and running.\n\nI am 125 years old, and  I was 124 years old 1 year ago.\n\nRegards,\nB.o.b \n'
>>> x.count('\n')
6
>>> 
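
If you would rather not count the blank lines, a small sketch like the following could work. It uses the same text as the session above; the variable names all_lines and non_blank are just for illustration.

```python
text = ('Hello, my name is B.o.b. I like biking, swimming, and running.\n\n'
        'I am 125 years old, and  I was 124 years old 1 year ago.\n\n'
        'Regards,\nB.o.b \n')

all_lines = text.splitlines()  # splits on line boundaries, no trailing empty entry
non_blank = [line for line in all_lines if line.strip()]  # drop whitespace-only lines

print(len(all_lines))  # 6
print(len(non_blank))  # 4
```

Inside the original loop, the equivalent would be to increment lineCount only when line.strip() is non-empty.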

Hope this helps!
