How can I search for strings in part of a text?

Posted 2024-03-29 04:45:51


I am trying to search multiple text files for occurrences of the strings "1-2", "2-3", and "3-H" in the last field of lines that begin with "play".

A sample of one of the text files is shown below:

id,ARI201803290
version,2
info,visteam,COL
info,hometeam,ARI
info,site,PHO01
play,1,0,lemad001,22,CFBBX,HR/78/F
play,1,0,arenn001,20,BBX,S7/L+
play,1,0,stort001,12,SBCFC,K
play,1,0,gonzc001,02,SS>S,K
play,1,1,perad001,32,BTBBCX,S9/G
play,1,1,polla001,02,CSX,S7/L+.1-2
play,1,1,goldp001,32,SBFBBB,W.2-3;1-2
play,1,1,lambj001,00,X,D9/F+.3-H;2-H;1-3
play,1,1,avila001,31,BC*BBX,31/G.3-H;2-3
play,2,0,grayj003,12,CC*BS,K
play,2,1,dysoj001,31,BBCBX,43/G
play,2,1,corbp001,31,CBBBX,43/G
play,4,1,avila001,02,SC1>X,S8/L.1-2

For the text file above, I would like the output to be "4", because "1-2", "2-3", and "3-H" occur 4 times in total.

The code I have so far is below, but I don't know where to start in writing the code that does this.

import os

input_folder = 'files'  # path of folder containing the multiple text files

# create a list with file names 
data_files = [os.path.join(input_folder, file) for file in     
os.listdir(input_folder)]

# open csv file for writing
csv = open('myoutput.csv', 'w')  
def write_to_csv(line):
    print(line)
    csv.write(line)


j=0 # initialise as 0
count_of_plate_appearances=0 # initialise as 0


for file in data_files:
    with open(file, 'r') as f:  # use context manager to open files
        lines = f.readlines()  # read the file once so the first line is not skipped
    i = 0
    while i < len(lines):
        temp_array = lines[i].rstrip().split(",")
        if temp_array[0] == "id":
            count_of_plate_appearances = 0
            game_id = temp_array[1]
            awayteam = lines[i+2].rstrip().split(",")[2]
            hometeam = lines[i+3].rstrip().split(",")[2]
            date = lines[i+5].rstrip().split(",")[2]

            # only check for plate appearances when temp_array[0] == "id";
            # min() keeps the range inside the file
            for j in range(i+46, min(i+120, len(lines))):
                temp_array2 = lines[j].rstrip().split(",")  # create new array to check for plate appearances
                # a plate appearance occurs when these are true
                if temp_array2[0] == "play" and temp_array2[2] == "1":
                    count_of_plate_appearances = count_of_plate_appearances + 1
            #print(count_of_plate_appearances)
            output_for_csv2 = (game_id, date, hometeam, awayteam, str(count_of_plate_appearances))
            print(output_for_csv2)
            csv.write(','.join(output_for_csv2) + '\n')
        i = i + 1


csv.close() 

Do you have any suggestions? Thanks in advance!


1 Answer
User
#1 · Posted 2024-03-29 04:45:51

You can use a regex. I put your text in a file called file.txt.

import re
a = ['1-2', '2-3', '3-H'] # What you want to count
find_this = re.compile('|'.join(a)) # Make search string
count = 0
with open('file.txt', 'r') as f:
    for line in f.readlines():
        count += len(find_this.findall(line)) # Each findall returns the list of things found
print(count) # 7

Or a shorter solution (thanks to wjandrea for suggesting a generator):

import re
a = ['1-2', '2-3', '3-H'] # What you want to count
find_this = re.compile('|'.join(a)) # Make search string
with open('file.txt', 'r') as f:
    count = sum(len(find_this.findall(line)) for line in f)
print(count) # 7
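
Note that both snippets above search each whole line. If the match should be limited to the last field of lines that start with "play", as the question describes, a minimal sketch might look like the following. The names count_advances and count_advances_in_folder are made up for illustration, and the sketch assumes every file in the folder is a comma-separated text file like the sample. re.escape is optional for these three strings; it is only there in case a target ever contains a regex metacharacter.

```python
import os
import re

# Same targets as in the answer above.
TARGETS = ['1-2', '2-3', '3-H']
PATTERN = re.compile('|'.join(re.escape(t) for t in TARGETS))


def count_advances(lines):
    """Count target matches in the last field of lines starting with 'play'."""
    total = 0
    for line in lines:
        fields = line.rstrip().split(',')
        if fields[0] == 'play':
            # Only the last comma-separated field is searched.
            total += len(PATTERN.findall(fields[-1]))
    return total


def count_advances_in_folder(folder):
    """Hypothetical helper: map each file name in folder to its count."""
    counts = {}
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            with open(path, 'r') as f:
                counts[name] = count_advances(f)
    return counts


# A few lines taken from the sample in the question.
sample = [
    'info,site,PHO01',
    'play,1,1,polla001,02,CSX,S7/L+.1-2',
    'play,1,1,goldp001,32,SBFBBB,W.2-3;1-2',
    'play,1,1,avila001,31,BC*BBX,31/G.3-H;2-3',
    'play,2,0,grayj003,12,CC*BS,K',
]
print(count_advances(sample))  # 5
```

For the full sample file in the question this should also give 7, since every match there already sits in the last field of a play line. To cover several files, something like count_advances_in_folder('files') would return one count per file in the question's files folder.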
