Get everything between two characters across newlines

Posted 2024-04-20 09:43:38


Here is a sample of the text I am working with:

6) Jake's Taxi Service is a new entrant to the taxi industry. It has achieved success by staking out a unique position in the industry. How did Jake's Taxi Service mostly likely achieve this position?

A) providing long-distance cab fares at a higher rate than competitors; servicing a larger area than competitors

B) providing long-distance cab fares at a lower rate than competitors; servicing a smaller area than competitors

C) providing long-distance cab fares at a higher rate than competitors; servicing the same area as competitors

D) providing long-distance cab fares at a lower rate than competitors; servicing the same area as competitors

Answer: D

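I am trying to match the entire question, including the answer options: everything from the question number through the word Answer.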

This is my current regex:

question_pattern = re.compile(rf'(?<={searchCounter}\) ).*?(?=Answer).*', re.DOTALL)

searchCounter is just a variable holding the number of the current question, 6 in this example. I think the problem has something to do with newlines.

Edit: full source code

import re

searchCounter = 1

bookDict = {}

with open('StratMasterKey.txt', 'rt') as myfile:

    for line in myfile:
        question_pattern = re.compile(rf'(?<={searchCounter}\) ).*?(?=Answer).*', re.DOTALL)

        result = question_pattern.search(line)
        if result is not None:
            bookDict[searchCounter] = result[0]
            searchCounter += 1

1 answer
Answer #1 · Posted 2024-04-20 09:43:38

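The regex fails because you read the file line by line with for line in myfile:, while the pattern can only match inside a single multi-line string. No individual line contains both the question number and the word Answer.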

Replace for line in myfile: with contents = myfile.read(), then use result = question_pattern.search(contents) to get the first match, or result = question_pattern.findall(contents) to get every match.
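To see why the loop never matches, here is a minimal sketch. The sample string is a made-up stand-in for the contents of StratMasterKey.txt, and the pattern is the one from the question with searchCounter fixed at 6.

```python
import re

# Hypothetical sample standing in for the real file contents.
sample = "6) Question text?\nA) first option\nAnswer: D\n"

# The pattern from the question, with searchCounter fixed at 6.
pattern = re.compile(r'(?<=6\) ).*?(?=Answer).*', re.DOTALL)

# Line by line: no single line holds both "6) " and "Answer",
# so every search comes back empty.
per_line = [pattern.search(line) for line in sample.splitlines(keepends=True)]

# Whole text: both markers are in the same string, so the search succeeds.
whole = pattern.search(sample)

print(per_line)
print(repr(whole.group()))
```

Searching the whole string succeeds because the question number and the word Answer now sit in the same input.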

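A note on the regex itself: I have not reworked the whole pattern, since, as you mentioned, that is beyond the scope of this question. However, now that the input is one multi-line string, you need to drop re.DOTALL, because with it the trailing .* after Answer would run to the end of the file. Use [\s\S] where the pattern must match any character including a newline, and . where it should match any character except a newline. The lookahead is also redundant: you can safely replace (?=Answer) with Answer. Finally, to check whether there was a match you can simply write if result:, and then get the whole matched text with result.group().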
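The effect of re.DOTALL on the trailing .* can be checked with a small sketch. The two-question sample below is invented for illustration; only the two patterns come from this thread.

```python
import re

# Hypothetical sample with two questions, standing in for the real file.
text = (
    "6) First question?\n"
    "A) option\n"
    "Answer: D\n"
    "7) Second question?\n"
    "A) option\n"
    "Answer: B\n"
)

# Original pattern: with DOTALL the final .* also crosses newlines,
# so the match continues to the end of the text.
with_dotall = re.search(r'(?<=6\) ).*?(?=Answer).*', text, re.DOTALL)

# Revised pattern: [\s\S]*? crosses newlines lazily, while the final .*
# stops at the end of the Answer line.
without_dotall = re.search(r'(?<=6\) )[\s\S]*?Answer.*', text)

print(repr(with_dotall.group()))
print(repr(without_dotall.group()))
```

With re.DOTALL the match swallows question 7 as well; without it, the match ends on the Answer line of question 6.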

Full snippet:

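import re

searchCounter = 6  # number of the question to extract (6 in the example above)

with open('StratMasterKey.txt', 'rt') as myfile:
    contents = myfile.read()  # read the whole file as one multi-line string

# [\s\S]*? crosses newlines lazily; the final .* stops at the end of the Answer line
question_pattern = re.compile(rf'(?<={searchCounter}\) )[\s\S]*?Answer.*')
result = question_pattern.search(contents)
if result:
    print(result.group())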
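If the goal is to fill a dictionary with every question, as the loop in the question attempts, one possible approach is a single pass with finditer. This is a sketch under assumptions, not part of the answer above: the function name extract_questions and the sample text are hypothetical, and it assumes each question starts a line with its number followed by ") " and ends with a line beginning "Answer:". Anchoring on the start of a line also avoids a side effect of the lookbehind, where (?<=6\) ) would match after "16) " too.

```python
import re

# Assumed layout: "<number>) " starts a question, "Answer:" starts its last line.
QUESTION_RE = re.compile(r'^(\d+)\) ([\s\S]*?^Answer:.*)', re.MULTILINE)

def extract_questions(text):
    """Return {question number: question text through its Answer line}."""
    return {int(m.group(1)): m.group(2) for m in QUESTION_RE.finditer(text)}

# Hypothetical sample standing in for the contents of the real file.
sample = (
    "6) First question?\n\nA) option one\n\nB) option two\n\nAnswer: D\n\n"
    "7) Second question?\n\nA) option one\n\nAnswer: B\n"
)

book_dict = extract_questions(sample)
for number, question in book_dict.items():
    print(number, repr(question))
```

Capturing the number in the pattern removes the need to recompile the regex for every value of searchCounter.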
