Regex: splitting on a wrapping pattern

Posted 2024-06-01 00:32:31


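I'm really confused by this one. The goal is to split on a wrapper, but not on that same wrapper when it appears inside something that is already wrapped.

Take the following string: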

s = 'something{now I am wrapped {I should not cause splitting} I am still wrapped}something else'

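The resulting list should be ['something','{','now I am wrapped {I should not cause splitting} I am still wrapped','}','something else']

The simplest thing I tried was findall, just to see how it behaves. Because the regex keeps no memory of nesting, it ignores the wrapping and stops at the first closing brace it finds. This is what happens: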

>>> import re
>>> s = 'something{now I am wrapped {I should not cause splitting} I am still wrapped}something else'
>>> re.findall(r'{.*?}',s)
['{now I am wrapped {I should not cause splitting}']

Is there a way to make it recognize whether a brace is part of an inner wrapper?


3 answers

I'm not sure this will always do what you need, but you can use partition and rpartition, for example:

In [26]: s_1 = s.partition('{')
In [27]: s_1
Out[27]: 
('something',
 '{',
 'now I am wrapped {I should not cause splitting} I am still wrapped}something else')
In [30]: s_2 = s_1[-1].rpartition('}')
In [31]: s_2
Out[31]: 
('now I am wrapped {I should not cause splitting} I am still wrapped',
 '}',
 'something else')
In [34]: s_out = s_1[0:-1] + s_2
In [35]: s_out
Out[35]: 
('something',
 '{',
 'now I am wrapped {I should not cause splitting} I am still wrapped',
 '}',
 'something else')
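
The steps above can be wrapped in a small helper. This is only a minimal sketch: the name `split_outer` and its parameters are hypothetical and not part of the original answer. It assumes the string contains a single outermost wrapped region. With two top-level groups, such as 'a{b}c{d}e', everything between the first '{' and the last '}' is treated as one wrapped part.

```python
def split_outer(text, open_ch='{', close_ch='}'):
    """Split text at the first open_ch and the last close_ch (hypothetical helper)."""
    head, opener, rest = text.partition(open_ch)
    if not opener:
        # No opening wrapper found: nothing to split.
        return [text]
    inner, closer, tail = rest.rpartition(close_ch)
    if not closer:
        # Opening wrapper without a closing one: leave the text untouched.
        return [text]
    return [head, opener, inner, closer, tail]
```

Calling `split_outer(s)` on the string from the question should give the same five pieces as `s_out`, as a list instead of a tuple.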

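Based on all the responses, I decided to just write a function that takes the string and the two wrappers and builds the output list by brute-force iteration: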

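def f(string,wrap1,wrap2):
    wrapped = False  # True while inside an outermost wrapper
    inner = 0        # depth of nested wrappers inside the outermost one
    count = 0        # index in holds of the piece currently being built
    holds = ['']
    for c in string:
        if c == wrap1 and not wrapped:
            # outermost opening wrapper: emit it and start a new piece
            count += 2
            wrapped = True
            holds.append(wrap1)
            holds.append('')
        elif c == wrap1 and wrapped:
            # nested opening wrapper: keep it as ordinary text
            inner += 1
            holds[count] += c
        elif c == wrap2 and wrapped and inner > 0:
            # closes a nested wrapper: keep it as ordinary text
            inner -= 1
            holds[count] += c
        elif c == wrap2 and wrapped and inner == 0:
            # closes the outermost wrapper: emit it and start a new piece
            wrapped = False
            count += 2
            holds.append(wrap2)
            holds.append('')
        else:
            holds[count] += c
    return holds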

This shows that it works:

>>> s = 'something{now I am wrapped {I should not cause splitting} I am still wrapped}something else'
>>> f(s,'{','}')
['something', '{', 'now I am wrapped {I should not cause splitting} I am still wrapped', '}', 'something else']

import re

s = 'something{now I am wrapped {I should not cause splitting} I am still wrapped}something else'
m = re.search(r'(.*)({)(.*?{.*?}.*?)(})(.*)', s)
print(m.groups())

New answer:

import re

s = 'something{now I am wrapped {I should {not cause} splitting} I am still wrapped}something else'
m = re.search(r'([^{]*)({)(.*)(})([^}]*)', s)
print(m.groups())
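
To make the behaviour of that pattern easier to check, here is a minimal sketch that wraps it in a function. The names `OUTER` and `split_with_regex` are hypothetical and not from the original answer. The greedy `(.*)` runs from the first '{' to the last '}', so the nesting depth does not matter, but the sketch still assumes only one outermost wrapped region.

```python
import re

# Same pattern as in the answer above: the greedy (.*) spans from the
# first '{' to the last '}' in the string.
OUTER = re.compile(r'([^{]*)({)(.*)(})([^}]*)')

def split_with_regex(text):
    """Return the five pieces as a list, or [text] if no wrapper matches (hypothetical helper)."""
    m = OUTER.search(text)
    return list(m.groups()) if m else [text]
```

Like the partition approach, this does not separate two top-level groups. Also, `.` does not match newlines unless `re.DOTALL` is passed, so multi-line input would need that flag.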
