Find all occurrences of multi-line strings between two defined markers in Python using a regular expression

import re

file = 'script.py'
with open(file, 'r') as f:
    text = f.read()

# Compile one pattern per marker pair, then collect every match.
code_blocks = re.compile(r'(?s)(?<=#c\n)(.*\n)(?=#-c)')
desc_blocks = re.compile(r'(?s)(?<=#d\n)(.*\n)(?=#-d)')
code = re.findall(code_blocks, text)
desc = re.findall(desc_blocks, text)

Example input (script.py):

# -*- coding: utf-8 -*-
"""
Title: blablabla
Author: Mathieu
Description: Report the some measurements
"""

import time
import uuid

from database.model import TestLimit, TestResult
from util import log

#d
"""
blablabla
some description which might be multiline
"""
#-d

# Constants
#c
CONSTANT_1 = 10 # Unit
CONSTANT_2 = 2 # Unit
#-c

class Foo(Fp):
    def __init__(self, #some parameters):
        # some init

    def _run(self, #some parameters):

        #%% Section Title 1

        #%%% Sub section Title 1
        #c
        code I would like to catch
        can be multiple lines
        #-c

        with something.open():
            #%%% Sub section title 2
            #d
            """some description I want to catch"""
            #-d

            #c
            some more code
            I want to catch
            #-c

1 answer

User

#1 · Posted 2024-05-13 03:41:37

You can use

re.findall(r'(?s)#c\n(.*?)\n[^\S\r\n]*#-c', text)

See the regex demo.

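Two things matter here:

The lazy .*? pattern, which stops at the first closing marker that follows instead of running on to the last one in the file.
[^\S\r\n]*, which matches zero or more horizontal whitespace characters in front of #-c. It is needed because the closing marker may be indented.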
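
To make both points concrete, here is a minimal sketch that runs the pattern from the question and the pattern from this answer against the same input. The sample text is made up for illustration and is not the asker's file; it has two top-level blocks and one indented block.

```python
import re

# Hypothetical sample: two top-level #c blocks and one indented #c block.
text = (
    "#c\n"
    "A = 1\n"
    "#-c\n"
    "x = 0\n"
    "#c\n"
    "B = 2\n"
    "#-c\n"
    "if x:\n"
    "    #c\n"
    "    C = 3\n"
    "    #-c\n"
)

# Pattern from the question: the greedy .* runs to the last "#-c" that
# directly follows a line break, so the two top-level blocks are merged
# and the indented block is not found at all.
greedy = re.findall(r'(?s)(?<=#c\n)(.*\n)(?=#-c)', text)

# Pattern from the answer: lazy .*? plus optional indentation before "#-c".
lazy = re.findall(r'(?s)#c\n(.*?)\n[^\S\r\n]*#-c', text)

print(len(greedy))  # 1
print(lazy)         # ['A = 1', 'B = 2', '    C = 3']
```

Note that the captured text keeps the indentation of the block body; only the whitespace in front of the closing marker is consumed outside the group.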

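Details

(?s) - the inline form of the re.DOTALL flag, so that . also matches line breaks
#c - a literal string
\n - a line feed character
(.*?) - capturing group 1: any zero or more characters, as few as possible
\n - a line feed character
[^\S\r\n]* - zero or more whitespace characters other than carriage return and line feed
#-c - a literal string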
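
The same breakdown can be written directly into the pattern with re.VERBOSE, which also makes it easy to reuse for the #d ... #-d markers. The following is only a sketch: extract_blocks and the sample string are hypothetical names and data introduced here, and the textwrap.dedent call is an optional extra that strips the common indentation from each captured block.

```python
import re
import textwrap


def extract_blocks(text, marker):
    """Return the dedented body of every #<marker> ... #-<marker> block.

    Hypothetical helper for illustration; marker is a short tag such as c or d.
    """
    m = re.escape(marker)
    pattern = re.compile(
        rf"""
        \#{m}\n        # opening marker followed by a line feed
        (.*?)          # group 1: block body, as few characters as possible
        \n             # line feed before the closing marker
        [^\S\r\n]*     # optional indentation in front of the closing marker
        \#-{m}         # closing marker
        """,
        re.DOTALL | re.VERBOSE,
    )
    return [textwrap.dedent(body) for body in pattern.findall(text)]


# Made-up input with one description block and one indented code block.
sample = (
    "#d\n"
    '"""doc"""\n'
    "#-d\n"
    "class Foo:\n"
    "    def run(self):\n"
    "        #c\n"
    "        a = 1\n"
    "        b = 2\n"
    "        #-c\n"
)

code = extract_blocks(sample, "c")
desc = extract_blocks(sample, "d")
print(code)
print(desc)
```

In verbose mode an unescaped # starts a comment, which is why the markers are written as \# inside the pattern.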
