Python 解析括号块
在Python中,解析出匹配括号中的文本块的最佳方法是什么?
"{ { a } { b } { { { c } } } }"
最开始应该返回:
[ "{ a } { b } { { { c } } }" ]
把这个作为输入应该返回:
[ "a", "b", "{ { c } }" ]
这应该返回:
[ "{ c }" ]
[ "c" ]
[]
9 个回答
7
我对Python还不是很熟悉,所以请多多包涵。不过,这里有一个可以正常运行的实现:
def balanced_braces(args):
parts = []
for arg in args:
if '{' not in arg:
continue
chars = []
n = 0
for c in arg:
if c == '{':
if n > 0:
chars.append(c)
n += 1
elif c == '}':
n -= 1
if n > 0:
chars.append(c)
elif n == 0:
parts.append(''.join(chars).lstrip().rstrip())
chars = []
elif n > 0:
chars.append(c)
return parts
t1 = balanced_braces(["{{ a } { b } { { { c } } } }"]);
print t1
t2 = balanced_braces(t1)
print t2
t3 = balanced_braces(t2)
print t3
t4 = balanced_braces(t3)
print t4
输出结果:
['{ a } { b } { { { c } } }']
['a', 'b', '{ { c } }']
['{ c }']
['c']
55
或者这是 pyparsing 的版本:
>>> from pyparsing import nestedExpr
>>> txt = "{ { a } { b } { { { c } } } }"
>>>
>>> nestedExpr('{','}').parseString(txt).asList()
[[['a'], ['b'], [[['c']]]]]
>>>
32
伪代码:
For each string in the array:
Find the first '{'. If there is none, leave that string alone.
Init a counter to 0.
For each character in the string:
If you see a '{', increment the counter.
If you see a '}', decrement the counter.
If the counter reaches 0, break.
Here, if your counter is not 0, you have invalid input (unbalanced brackets)
If it is, then take the string from the first '{' up to the '}' that put the
counter at 0, and that is a new element in your array.