How can I use pyparsing to parse nested expressions that have multiple opener/closer types?

15 votes
2 answers
10060 views
Asked 2025-04-16 10:38

I'd like to use pyparsing to parse an expression of the form expr = '(gimme [some {nested [lists]}])' and get back a Python list of the form [[['gimme', ['some', ['nested', ['lists']]]]]]. Right now my grammar looks like this:

from pyparsing import nestedExpr

nestedParens = nestedExpr('(', ')')
nestedBrackets = nestedExpr('[', ']')
nestedCurlies = nestedExpr('{', '}')
enclosed = nestedParens | nestedBrackets | nestedCurlies

Presently, enclosed.searchString(expr) returns [[['gimme', ['some', '{nested', '[lists]}']]]]. This isn't what I want, because it isn't recognizing the square or curly brackets, but I don't know why.

2 Answers

-3

This should do the trick for you. I tested it on your example:

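import re
import ast

def parse(s):
    # Normalise every opening bracket to '[' and every closing bracket to ']'.
    s = re.sub(r"[{(\[]", "[", s)
    s = re.sub(r"[})\]]", "]", s)
    answer = ''
    for i, char in enumerate(s):
        if char == '[':
            # Open a list and start a quoted string.
            answer += char + "'"
        elif char == ']':
            answer += char
        else:
            answer += char
            # Close the quoted string just before the next bracket.
            if i + 1 < len(s) and s[i+1] in '[]':
                answer += "', "
    # Evaluate the generated list literal and return the nested list.
    # Note: words keep their trailing whitespace (e.g. 'gimme '), and text
    # that follows a closing bracket is not handled.
    return ast.literal_eval(answer)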

Feel free to leave a comment if you need more help.
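The same idea, treating all three bracket kinds as interchangeable, can also be sketched without building a string for ast.literal_eval. The version below is only an illustration and not part of the answer above: the names TOKEN and parse_nested are made up for this example. It splits the input into brackets and words with a regular expression and builds the nested lists with a stack.

```python
import re

# A token is either a single bracket character or a run of characters
# that are neither whitespace nor brackets.
TOKEN = re.compile(r"[\[\](){}]|[^\s\[\](){}]+")

def parse_nested(text):
    """Turn bracketed text into nested lists, treating (), [] and {} alike."""
    root = []
    stack = [root]                      # stack[-1] is the list being filled
    for token in TOKEN.findall(text):
        if token in "([{":
            child = []
            stack[-1].append(child)     # attach the new sublist to its parent
            stack.append(child)
        elif token in ")]}":
            if len(stack) == 1:
                raise ValueError("unbalanced closing bracket")
            stack.pop()
        else:
            stack[-1].append(token)     # a plain word
    if len(stack) != 1:
        raise ValueError("unclosed opening bracket")
    return root
```

Under these assumptions, parse_nested('(gimme [some {nested [lists]}])') returns [['gimme', ['some', ['nested', ['lists']]]]], and words that follow a closing bracket, as in '(a [b] c)', are kept. Like the answer above, it does not check that a '(' is closed by ')' rather than by ']' or '}'.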

28

Here is a pyparsing solution that uses a self-modifying grammar to dynamically match the correct closing bracket character.

from pyparsing import *

data = '(gimme [some {nested, nested [lists]}])'

opening = oneOf("( { [")
nonBracePrintables = ''.join(c for c in printables if c not in '(){}[]')
closingFor = dict(zip("({[",")}]"))
closing = Forward()
# initialize closing with an expression
closing << NoMatch()
closingStack = []
def pushClosing(t):
    closingStack.append(closing.expr)
    closing << Literal( closingFor[t[0]] )
def popClosing():
    closing << closingStack.pop()
opening.setParseAction(pushClosing)
closing.setParseAction(popClosing)

matchedNesting = nestedExpr( opening, closing, Word(alphas) | Word(nonBracePrintables) )

print(matchedNesting.parseString(data).asList())

This prints:

[['gimme', ['some', ['nested', ',', 'nested', ['lists']]]]]
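The bookkeeping behind pushClosing and popClosing may be easier to see outside of pyparsing. The following is a plain-Python sketch of that stack discipline only, not of the grammar itself: every opener pushes the closer it expects, and every closer must equal the one on top of the stack. The function name brackets_match is hypothetical.

```python
# Map each opener to its partner, as closingFor does in the grammar above.
CLOSING_FOR = dict(zip("({[", ")}]"))

def brackets_match(text):
    """Return True if every bracket in text is closed by its own partner."""
    closing_stack = []                  # closers still owed, innermost last
    for ch in text:
        if ch in CLOSING_FOR:
            # Like pushClosing: remember which closer this opener needs.
            closing_stack.append(CLOSING_FOR[ch])
        elif ch in CLOSING_FOR.values():
            # Like popClosing: only the innermost expected closer is valid.
            if not closing_stack or closing_stack.pop() != ch:
                return False
    return not closing_stack            # leftover entries mean unclosed openers
```

With this sketch, brackets_match('(gimme [some {nested, nested [lists]}])') is True, while brackets_match('(gimme [some)]') is False because ')' arrives while ']' is the expected closer.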

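UPDATE: I posted the solution above because I had actually written it over a year ago as an experiment. I just took a closer look at your original post, and it made me think of the recursive type definition created by the operatorPrecedence method, so I redid this solution using your original approach, which is much simpler to follow! (It might have a left-recursion issue with certain input data, though; it is not thoroughly tested.)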

from pyparsing import *

enclosed = Forward()
nestedParens = nestedExpr('(', ')', content=enclosed) 
nestedBrackets = nestedExpr('[', ']', content=enclosed) 
nestedCurlies = nestedExpr('{', '}', content=enclosed) 
enclosed << (Word(alphas) | ',' | nestedParens | nestedBrackets | nestedCurlies)


data = '(gimme [some {nested, nested [lists]}])' 

print(enclosed.parseString(data).asList())

This gives:

[['gimme', ['some', ['nested', ',', 'nested', ['lists']]]]]
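For readers on pyparsing 3.x, the same recursive grammar can be written with the snake_case names introduced in 3.0 and explicit imports. This is a sketch, not the answer's original code: the wrapper function parse_brackets and the use of parse_all=True are additions for illustration. It also shows why the grammar in the question fell short: without content=, each nestedExpr treats the other bracket characters as ordinary word characters, whereas passing the shared Forward as content lets every bracket kind nest inside every other.

```python
import pyparsing as pp

# Forward declaration so each nested_expr can refer to the full grammar.
enclosed = pp.Forward()
nested_parens = pp.nested_expr('(', ')', content=enclosed)
nested_brackets = pp.nested_expr('[', ']', content=enclosed)
nested_curlies = pp.nested_expr('{', '}', content=enclosed)
enclosed <<= (pp.Word(pp.alphas) | ',' | nested_parens
              | nested_brackets | nested_curlies)

def parse_brackets(text):
    """Parse text with the grammar above; hypothetical helper for this example."""
    # parse_all=True makes trailing unparsed input an error.
    return enclosed.parse_string(text, parse_all=True).as_list()

if __name__ == "__main__":
    print(parse_brackets('(gimme [some {nested, nested [lists]}])'))
```

Because each nested_expr only accepts its own closer, an input with crossed brackets such as '(gimme [some)]' should raise pp.ParseException rather than produce a list.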

EDIT: Here is a diagram of the updated parser, generated with the railroad diagram support added in pyparsing 3.0. [Image: railroad diagram]
