Suppressing empty strings in pyparsing

from pyparsing import Suppress, nestedExpr, quotedString, removeQuotes

data = '{"url", {action1, action2}, "", class1 + class2}'
quotedString.setParseAction(removeQuotes)
parser = nestedExpr(opener="{", closer="}",
                    ignoreExpr=(quotedString | Suppress(",") | Suppress("+")))
print(parser.parseString(data, parseAll=True)[0])

1 Answer

User

#1 · Posted 2024-06-09 22:19:53

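nestedExpr is really just a "cheater" expression for easily skipping over nested lists in parentheses, braces, and so on. To actually parse the contents, or to do any meaningful processing of them, it is much clearer to define a real recursive expression (though that takes some extra work).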

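That said, I took a few shortcuts of my own. I defined word_sum as a delimited list of words with "+" as the delimiter, which suppresses the "+" signs. I then used delimitedList again for the ","-separated parts; this likewise returns just the list items and drops the delimiting commas. With these shortcuts, the recursive grammar turns out to be quite short. See the comments in the annotated code below.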

(As for your question about suppressing several characters by listing them, you can use either pp.Suppress(pp.oneOf('+ - , *')) or pp.oneOf("+ - , *").suppress(), whichever suits your taste.)
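A minimal sketch of the oneOf(...).suppress() form mentioned above. The input string and the names op, expr, and result are made up for illustration and are not part of the original question:

```python
import pyparsing as pp

# One expression that matches any of the listed operators and drops it
# from the parsed results.
op = pp.oneOf("+ - , *").suppress()
word = pp.Word(pp.alphas, pp.alphanums)

# A word followed by zero or more (operator, word) pairs.
expr = word + pp.ZeroOrMore(op + word)

result = expr.parseString("a1 + b2 - c3, d4 * e5", parseAll=True).asList()
print(result)
# prints
# ['a1', 'b2', 'c3', 'd4', 'e5']
```

Only the words survive, since every operator is matched but suppressed.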

import pyparsing as pp

# delimitedList will suppress the delimiters, and just return the list elements
# delimitedList also will match just a single word
word = pp.Word(pp.alphas, pp.alphanums)
word_sum = pp.delimitedList(word, delim="+")

# expression for items that can be found inside the braces, in a list delimited by commas
# - define an explicit suppressor for ""
# - match QuotedStrings
# - match word_sums
item = pp.Literal('""').suppress() | pp.QuotedString('"') | word_sum

# define a Forward for the recursive expression
brace_expr = pp.Forward()

# define the contents of the recursive expression, which can include a reference to itself
# (use '<<=', not '=' for this definition)
LBRACE, RBRACE = map(pp.Suppress, "{}")
brace_expr <<= pp.Group(LBRACE + pp.delimitedList(item | brace_expr, delim=",") + RBRACE)

# try it out!
text = data = '{"url", {action1, action2}, "", class1 + class2}'
print(brace_expr.parseString(text)[0])

# prints
# ['url', ['action1', 'action2'], 'class1', 'class2']
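
As a further check, the same grammar can be run against a more deeply nested input to confirm that empty strings are dropped at every level. This is only a sketch: the input string nested and the variable result are hypothetical test values, and the grammar is repeated so the snippet runs on its own.

```python
import pyparsing as pp

# Same grammar as in the answer above.
word = pp.Word(pp.alphas, pp.alphanums)
word_sum = pp.delimitedList(word, delim="+")
item = pp.Literal('""').suppress() | pp.QuotedString('"') | word_sum

brace_expr = pp.Forward()
LBRACE, RBRACE = map(pp.Suppress, "{}")
brace_expr <<= pp.Group(LBRACE + pp.delimitedList(item | brace_expr, delim=",") + RBRACE)

# Hypothetical input with empty strings at two nesting depths.
nested = '{"a", {"", b + c, {d, ""}}, e}'
result = brace_expr.parseString(nested, parseAll=True).asList()[0]
print(result)
# prints
# ['a', ['b', 'c', ['d']], 'e']
```

Passing parseAll=True makes the parse fail loudly if any trailing text is left unmatched.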
