Accessing regex capture groups in Python

Posted 2024-06-01 01:45:33


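ptx captures most of what I want. Because I could not combine everything into a single regular expression, I created a second regex, ptx1, which should also capture the following character sequences: One Department, One foreign Department, Two office.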

    import re

    text_list = [' something\npatternx: text_i_want One Department',
                 ' something patternx: text_i_want One foreign Department',
                 ' something\n patternx: text_i_want Two office']
    text_list = ' '.join(map(str, text_list))
    # Greedy (.*) with re.DOTALL runs up to the last "One foreign" in the joined string
    ptx = re.compile(r'(\s+something(?:\s+|\\n)*patternx:)(.*)(One\s+foreign)', flags=re.DOTALL)
    ten = ptx.search(text_list)
    # Group 2 is the (.*) part; fall back to None when nothing matches
    ten = ten.group(2) if ten else None

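My question: what do I need to do to return the text_i_want content matched by (.*)? I have a hunch that, because there are so many capture groups, I need to access eleven like a list, e.g. eleven[0].group(1), to take the first element of the list and then get the second group. But that does not work either.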

You can think of text_list like this:

    text_list = ['...something\npatternx: text_i_want One Department',
                 '...something patternx: text_i_want One foreign Department',
                 '...something\n patternx: text_i_want Two office']

Update

    import re

    text_list = [' something\npatternx: text_i_want One Department',
                 ' something patternx: text_i_want One foreign Department',
                 ' something\n patternx: text_i_want Two office']
    text_list = ' '.join(map(str, text_list))
    ptx = re.compile(r'\bsomething\s+patternx:(.*?)\b(?:One\s+(?:Department|foreign(?:\s+Department)?)|Two\s+office)\b', flags=re.DOTALL)
    ten = ptx.search(text_list)
    # This pattern has a single capture group, so the wanted text is group(1);
    # group(2) would raise IndexError
    ten = ten.group(1) if ten else None

1 answer

Answer #1 · Posted 2024-06-01 01:45:33

It looks like the alternatives on the right-hand side of the pattern are what tripped you up.

You need to use

    \bsomething\s+patternx:(.*?)\b(?:One\s+foreign|One\s+Department|One\s+foreign\s+Department|Two\s+office)\b

which can be shortened to

    \bsomething\s+patternx:(.*?)\b(?:One\s+(?:Department|foreign(?:\s+Department)?)|Two\s+office)\b
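As a quick sanity check that the shortened pattern captures the same Group 1 text as the longer one, the sketch below runs both over the sample strings from the question. The names text, long_rx, short_rx, long_caps, and short_caps are illustrative and not part of the original answer.

```python
import re

# Sample strings from the question, joined the same way
text = ' '.join([' something\npatternx: text_i_want One Department',
                 ' something patternx: text_i_want One foreign Department',
                 ' something\n patternx: text_i_want Two office'])

long_rx = r'\bsomething\s+patternx:(.*?)\b(?:One\s+foreign|One\s+Department|One\s+foreign\s+Department|Two\s+office)\b'
short_rx = r'\bsomething\s+patternx:(.*?)\b(?:One\s+(?:Department|foreign(?:\s+Department)?)|Two\s+office)\b'

# With one capture group, findall returns the Group 1 text of every match
long_caps = re.findall(long_rx, text, re.DOTALL)
short_caps = re.findall(short_rx, text, re.DOTALL)

print(long_caps == short_caps)
```

The overall match can differ between the two: in the longer pattern the One\s+foreign alternative is tried first, so the match stops before Department, while the shortened pattern consumes the optional Department as well. Group 1 is the same either way.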

See the regex demo. Details:

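  • \bsomething\s+patternx: - the whole word something, one or more whitespace characters, then the string patternx:
  • (.*?) - Group 1: any zero or more characters, as few as possible (line breaks included, because re.DOTALL is used)
  • \b(?:One\s+(?:Department|foreign(?:\s+Department)?)|Two\s+office)\b - One Department, One foreign, One foreign Department, or Two office as whole words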
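To illustrate why Group 1 uses the lazy (.*?) and not a greedy (.*), here is a minimal sketch that runs both variants over the joined sample text. The greedy variant is a hypothetical modification made only for comparison, and the names text, tail, greedy, and lazy are illustrative.

```python
import re

text = ' '.join([' something\npatternx: text_i_want One Department',
                 ' something patternx: text_i_want One foreign Department',
                 ' something\n patternx: text_i_want Two office'])

# Shared right-hand side, taken from the shortened pattern in the answer
tail = r'\b(?:One\s+(?:Department|foreign(?:\s+Department)?)|Two\s+office)\b'

# The two patterns differ only in the quantifier inside Group 1
greedy = re.compile(r'\bsomething\s+patternx:(.*)' + tail, re.DOTALL)
lazy = re.compile(r'\bsomething\s+patternx:(.*?)' + tail, re.DOTALL)

greedy_match = greedy.search(text)
lazy_match = lazy.search(text)

# Greedy: Group 1 runs on to the last possible terminator in the whole text
print(repr(greedy_match.group(1)))
# Lazy: Group 1 stops at the first terminator it can reach
print(repr(lazy_match.group(1)))

# Number of matches each variant finds in the joined text
print(len(greedy.findall(text)), len(lazy.findall(text)))
```

With re.DOTALL the greedy group can cross the line breaks and swallow the other two records, so only the lazy form yields one capture per record.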

Python demo

    import re
    text_list = [' something\npatternx: text_i_want One Department',' something patternx: text_i_want One foreign Department',' something\n patternx: text_i_want Two office']
    text_list = ' '.join(map(str, text_list))
    rx = r'\bsomething\s+patternx:(.*?)\b(?:One\s+(?:Department|foreign(?:\s+Department)?)|Two\s+office)\b'
    print(re.findall(rx, text_list, re.DOTALL))
    # => [' text_i_want ', ' text_i_want ', ' text_i_want ']
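On the question of indexing the result like a list: re.findall returns plain strings when the pattern has exactly one capture group, so calling .group() on an element fails. The sketch below shows the standard re alternatives for reaching Group 1. The names text, rx, first, captures, and strings are illustrative, and the .strip() calls are an added assumption that the surrounding spaces are unwanted.

```python
import re

text = ' '.join([' something\npatternx: text_i_want One Department',
                 ' something patternx: text_i_want One foreign Department',
                 ' something\n patternx: text_i_want Two office'])
rx = re.compile(
    r'\bsomething\s+patternx:(.*?)\b(?:One\s+(?:Department|foreign(?:\s+Department)?)|Two\s+office)\b',
    re.DOTALL)

# search() returns one Match object (or None). The pattern defines a single
# capture group, so group(1) is the captured text and group(0) the whole match.
first = rx.search(text)
first_capture = first.group(1).strip() if first else None

# finditer() yields a Match object per hit, so group(1) works on each of them
captures = [m.group(1).strip() for m in rx.finditer(text)]

# findall() with one capture group returns a list of str, not Match objects
strings = rx.findall(text)

# Asking for a group the pattern does not define raises IndexError
missing_group_error = None
if first:
    try:
        first.group(2)
    except IndexError as exc:
        missing_group_error = str(exc)

print(first_capture)
print(captures)
```

A bare except: pass around the group call would hide that IndexError and leave the variable holding the Match object, which is worth checking if the result is not the expected string.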
