Reading a file in Python and checking whether it contains specific strings

Asked 2025-04-16 02:31

I have a file in the following format:

Summary;None;Description;Emails\nDarlene\nGregory Murphy\nDr. Ingram\n;DateStart;20100615T111500;DateEnd;20100615T121500;Time;20100805T084547Z
Summary;Presence tech in smart energy management;Description;;DateStart;20100628T130000;DateEnd;20100628T133000;Time;20100628T055408Z
Summary;meeting;Description;None;DateStart;20100629T110000;DateEnd;20100629T120000;Time;20100805T084547Z
Summary;meeting;Description;None;DateStart;20100630T090000;DateEnd;20100630T100000;Time;20100805T084547Z
Summary;Balaji Viswanath: Meeting;Description;None;DateStart;20100712T140000;DateEnd;20100712T143000;Time;20100805T084547Z
Summary;Government Industry Training:  How Smart is Your City - The Smarter City Assessment Tool\nUS Call-In Information:  1-866-803-2143\,     International Number:  1-210-795-1098\,     International Toll-free Numbers:  See below\,     Passcode:  6785765\nPresentation Link - Copy and paste URL into web browser:  http://w3.tap.ibm.com/medialibrary/media_view?id=87408;Description;International Toll-free Numbers link - Copy and paste this URL into your web browser:\n\nhttps://w3-03.sso.ibm.com/sales/support/ShowDoc.wss?docid=NS010BBUN-7P4TZU&infotype=SK&infosubtype=N0&node=clientset\,IA%7Cindustries\,Y&ftext=&sort=date&showDetails=false&hitsize=25&offset=0&campaign=#International_Call-in_Numbers;DateStart;20100811T203000;DateEnd;20100811T213000;Time;20100805T084547Z

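Now I need to write a function that does the following:

The function's argument specifies which line to read. Assume I have already split that line on semicolons (line.split(";")).

  1. Check whether line[1] contains "meeting" or "call in number", and whether line[2] contains either of those phrases. If either condition holds, the function should return "call-in meeting". If neither holds, it should return "None Inferred".

Thanks in advance!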

3 Answers

1

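vlad003 is right: if your records contain newline characters, each one spills over onto several lines of the file! In that case, I suggest splitting on "Summary" instead: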

from itertools import islice

def chunks( filePath ):
    """Since the records contain newline characters, you can't
    read the file one line at a time. This generator reads the
    lines of the file and groups them into chunks, starting a
    new chunk each time a line begins with 'Summary'."""
    with open( filePath ) as theFile:
        chunk = [ ]
        for line in theFile:
            if line.startswith( "Summary" ):
                if chunk: yield chunk
                chunk = [ line ]
            else:
                chunk.append( line )
        if chunk: yield chunk  # don't yield an empty chunk for an empty file

def nth(iterable, n, default=None):
    "Gets the nth element of an iterator."
    return next(islice(iterable, n, None), default)

def getStatus( filePath, chunkNum ):
    """Get the nth chunk of the file, split it by ';', and return the status."""
    chunk = nth( chunks( filePath ), chunkNum )
    if chunk is None:
        raise IndexError( "could not get chunk %d" % chunkNum )
    # A chunk is a list of lines; join them before splitting into fields.
    fields = "".join( chunk ).split( ";" )
    if "meeting" in fields[ 1 ].lower() or "call in number" in fields[ 1 ].lower():
        return "call-in meeting"
    else:
        return "None Inferred"

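Note that if you intend to query every section of the file, this approach is rather wasteful, because each call opens the file and reads through it line by line again. If you plan to do this often, it is better to parse the file into a more convenient data structure (for example, a list of statuses). That way the file is read only once and lookups become much simpler.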
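
A minimal sketch of that parse-once idea, assuming the same chunking rule as above (a new record starts at each line beginning with "Summary"). The names parse_statuses and classify are illustrative and not part of the original answer:

```python
def classify(record):
    """Return the status for one record (the full text of one 'Summary' chunk)."""
    fields = record.split(";")
    # fields[1] is the value that follows the "Summary" key; guard against short records.
    summary = fields[1].lower() if len(fields) > 1 else ""
    if "meeting" in summary or "call in number" in summary:
        return "call-in meeting"
    return "None Inferred"


def parse_statuses(lines):
    """Read an iterable of lines once and return a list with one status per record."""
    statuses = []
    chunk = []
    for line in lines:
        # A line starting with "Summary" closes the previous record, if any.
        if line.startswith("Summary") and chunk:
            statuses.append(classify("".join(chunk)))
            chunk = []
        chunk.append(line)
    if chunk:
        statuses.append(classify("".join(chunk)))
    return statuses
```

An open file object can be passed directly, e.g. `with open(path) as f: statuses = parse_statuses(f)`, after which `statuses[n]` answers each query without touching the file again.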

1

This is an addition to ghostdog74's answer:

def finder(line):
    '''Takes line number as argument. First line is number 0.'''
    with open('/home/vlad/Desktop/file.txt') as f:
        lines = f.read().split('Summary')[1:]
        searchLine = lines[line]
        if 'meeting' in searchLine.lower() or 'call in number' in searchLine.lower():
            return 'call-in meeting'
        else:
            return 'None Inferred'

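I'm not sure what you mean by line[1] and line[2], so this is as far as I could take it.

Edit: I fixed the \n problem. I figured that since you are searching for "meeting" and "call in number", you don't need the word "Summary" itself, so I used it to split the records.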
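
One caveat with splitting on the bare word: split('Summary') also cuts wherever "Summary" happens to appear inside a field. A hedged variant, assuming every record begins with "Summary;" at the start of a line, splits only at those positions using re.split with re.MULTILINE. The name split_records is illustrative:

```python
import re


def split_records(text):
    """Split text into records, cutting only where a line begins with 'Summary;'."""
    # With re.MULTILINE, ^ matches at the start of every line, not just the start of the text.
    parts = re.split(r"^Summary;", text, flags=re.MULTILINE)
    # parts[0] is whatever precedes the first record (usually empty), so drop it.
    return parts[1:]
```

Each returned record starts with the summary text, so the same 'in' checks as in finder can be applied to it.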

1

Use the 'in' operator to check whether a string contains a match.

for line in open("file"):
    if "string" in line :
        ....
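
Applied to the question, a minimal sketch might look like the following. It assumes the record has already been split on ";" as the question describes, and that a match in either of the chosen fields is enough. The indices default to 1 and 2 to follow the question literally; in the sample data the description text actually sits at index 3, so adjust them as needed. The names infer_type and KEYWORDS are illustrative:

```python
KEYWORDS = ("meeting", "call in number")


def infer_type(fields, indices=(1, 2)):
    """Return 'call-in meeting' if any keyword occurs in any of the chosen fields."""
    for i in indices:
        # Skip indices that do not exist in a short record.
        if i < len(fields):
            text = fields[i].lower()
            if any(keyword in text for keyword in KEYWORDS):
                return "call-in meeting"
    return "None Inferred"
```

For example, `infer_type(line.split(";"))` checks fields 1 and 2, while `infer_type(line.split(";"), indices=(1, 3))` checks the summary and description values instead.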
