Counting different strings across multiple files

def count_string_occurrence():
    import os
    total = 0
    x = 0
    for file in os.listdir("C:/users/M/Desktop/test"):
        if file.endswith(".txt"):
            string = ":)" #define search term
            f=open(file,encoding="utf8")
            contents = f.read()
            f.close()
            x=contents.count(string)
            total +=int(x) #calculate occurrence of smiley in all files
    print("Number of " + string + " in all files equals " + str(total))

count_string_occurrence()

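2 Answers

User

Answer 1 · Edited 2024-05-16 03:59:25

You can make the search string a function parameter and then call the function several times with different search terms.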

import os

def count_string_occurrence(string):
    folder = "C:/users/M/Desktop/test"
    total = 0
    for file in os.listdir(folder):
        if file.endswith(".txt"):
            # os.listdir returns bare file names, so join them with the folder
            with open(os.path.join(folder, file), encoding="utf8") as f:
                contents = f.read()
            total += contents.count(string)  # add this file's occurrences
    return total

smilies = [':)', ':P', '=]']
for s in smilies:
    total = count_string_occurrence(s)
    print("Number of {} in all files equals {}".format(s, total))

Another approach is to pass the smilies list to the function and iterate over it inside the if block. The results can be stored in a dict of the form { ':)': 5, ':P': 4, ... }
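
A minimal sketch of that second approach. The function name `count_strings_in_dir` and the `directory` parameter are hypothetical, introduced here only so the hard-coded path does not have to live inside the function:

```python
import os


def count_strings_in_dir(strings, directory):
    """Return a dict mapping each search string to its total count.

    `directory` is a hypothetical parameter added for this sketch.
    """
    totals = {s: 0 for s in strings}
    for name in os.listdir(directory):
        if name.endswith(".txt"):
            # os.listdir returns bare names, so join them with the directory
            path = os.path.join(directory, name)
            with open(path, encoding="utf8") as f:
                contents = f.read()
            # Each file is read once, then searched for every string
            for s in strings:
                totals[s] += contents.count(s)
    return totals
```

Calling `count_strings_in_dir([':)', ':P', '=]'], "C:/users/M/Desktop/test")` would return a dict in the `{ ':)': 5, ':P': 4, ... }` shape described above.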

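User

Answer 2 · Edited 2024-05-16 03:59:25

Regarding your question: you can keep a dictionary that holds the count for each string and return it. That does not work well, though, if you keep your current structure.

My suggestions: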

You read the whole file into memory, when apparently you could check it line by line.
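You also read the same files several times, when you could read each one just once and check for every string.
You are checking the file extension, which sounds like a job for glob.
You can use a defaultdict, so you do not have to care whether a count has been initialised to 0.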
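
To illustrate the last point, here is a small sketch (the variable names are made up for the example) comparing a plain dict with `collections.defaultdict`:

```python
from collections import defaultdict

plain = {}
counts = defaultdict(int)

for s in [':)', ':P', ':)']:
    # A plain dict needs an explicit default for keys it has not seen yet
    plain[s] = plain.get(s, 0) + 1
    # defaultdict(int) creates missing keys with the value 0 on first access
    counts[s] += 1

print(dict(counts))
```

Both containers end up with the same counts; the defaultdict version simply avoids the explicit default.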

Modified code:

from collections import defaultdict
import glob

SMILIES = [':)', ':P', '=]']

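def count_in_files(string_list):
    results = defaultdict(int)  # missing keys start at 0
    # iglob yields matching file names in the current directory one at a time
    for file_name in glob.iglob('*.txt'):
        print(file_name)  # show which file is being scanned
        with open(file_name) as input_file:
            for line in input_file:
                for s in string_list:
                    # Counts lines that contain s, not every occurrence in a line
                    if s in line:
                        results[s] += 1
    return results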

print(count_in_files(SMILIES))

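Finally, with this approach, if you are using Python >= 3.5 you can change the glob call to for file_name in glob.iglob('**/*.txt', recursive=True) so that it searches recursively, in case you need that.

This will print something like: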

defaultdict(<class 'int'>, {':P': 2, ':)': 1, '=]': 1})
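
The recursive variant mentioned above could be sketched roughly as follows. The function name `count_in_tree` and the `root` argument are hypothetical, and this version uses `line.count(s)` so that every occurrence is counted rather than every matching line; both are assumptions made for illustration, not part of the original answer:

```python
from collections import defaultdict
import glob
import os


def count_in_tree(string_list, root="."):
    """Count occurrences of each string in all .txt files under `root`.

    `root` and the function name are hypothetical, added for this sketch.
    """
    results = defaultdict(int)
    # '**' together with recursive=True also matches files in subdirectories
    pattern = os.path.join(root, "**", "*.txt")
    for file_name in glob.iglob(pattern, recursive=True):
        with open(file_name, encoding="utf8") as input_file:
            for line in input_file:
                for s in string_list:
                    # str.count counts non-overlapping occurrences in the line
                    results[s] += line.count(s)
    return results
```

Because the count is added unconditionally, strings that never appear still show up in the result with a value of 0.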
