Removing nested BBCode quotes in Python?

1 vote
2 answers
580 views
Asked 2025-04-17 08:05

I tried searching for this, but only found answers for PHP. I'm using Python on Google App Engine and want to strip nested quotes.

For example:

[quote user2]
[quote user1]Hello[/quote]
World
[/quote]

I want to run some code that keeps only the outermost quote:

[quote user2]World[/quote]

2 Answers

0

You should look for a real BBCode parser in Python. A Google search turns up a few candidates, such as this one and this one.

3

I'm not sure whether you want to extract just the quotes, or to strip the nested quotes out of the whole input. This pyparsing example does both:

stuff = """
Other stuff
[quote user2] 
[quote user1]Hello[/quote] 
World 
[/quote] 
Other stuff after the stuff
"""

from pyparsing import (Word, printables, originalTextFor, Literal, OneOrMore, 
    ZeroOrMore, Forward, Suppress)

# prototype username
username = Word(printables, excludeChars=']')

# BBCODE quote tags
openQuote = originalTextFor(Literal("[") + "quote" + username + "]")
closeQuote = Literal("[/quote]")

# use negative lookahead to keep BBCODE quote tags out of the body of the quote
contentWord = ~(openQuote | closeQuote) + (Word(printables,excludeChars='[') | '[')
content = originalTextFor(OneOrMore(contentWord))

# define recursive definition of quote, suppressing any nested quotes
quotes = Forward()
quotes << ( openQuote + ZeroOrMore( Suppress(quotes) | content ) + closeQuote )

# put separate tokens back together
quotes.setParseAction(lambda t : '\n'.join(t))

# quote extractor
for q in quotes.searchString(stuff):
    print(q[0])

# nested quote stripper
print(quotes.transformString(stuff))

This prints:

[quote user2]
World
[/quote]

Other stuff
[quote user2]
World
[/quote] 
Other stuff after the stuff
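
If pulling in pyparsing is not an option, the same stripping can be approximated with the standard library alone by scanning for quote tags and tracking nesting depth. This is only a rough sketch, not a full BBCode parser: the function name strip_nested_quotes and the tag pattern are my own assumptions, it treats anything of the form [quote ...] as an opening tag, it is case-sensitive, and it knows nothing about other BBCode tags.

```python
import re

# Assumed tag shapes: "[quote user2]" (or bare "[quote]") opens, "[/quote]" closes.
QUOTE_TAG = re.compile(r"\[quote[^\]]*\]|\[/quote\]")


def strip_nested_quotes(text):
    """Return text with every quote nested inside another quote removed."""
    out = []
    depth = 0  # number of currently open [quote] tags
    pos = 0    # index just past the previously handled tag
    for match in QUOTE_TAG.finditer(text):
        tag = match.group()
        # Text before this tag is kept unless it sits inside a nested quote.
        if depth <= 1:
            out.append(text[pos:match.start()])
        if tag.startswith("[/"):
            # Keep a closer that ends an outermost quote; a stray closer
            # seen at depth 0 is left in place untouched.
            if depth <= 1:
                out.append(tag)
            depth = max(depth - 1, 0)
        else:
            depth += 1
            # Keep an opener only when it starts an outermost quote.
            if depth == 1:
                out.append(tag)
        pos = match.end()
    # Whatever follows the last tag is outside any nested quote.
    out.append(text[pos:])
    return "".join(out)


if __name__ == "__main__":
    sample = "[quote user2]\n[quote user1]Hello[/quote]\nWorld\n[/quote]"
    print(strip_nested_quotes(sample))
```

Note that the line break that preceded the removed inner quote is kept, so the result has a blank line where the nested quote used to be. Unlike the pyparsing grammar, this sketch does not validate that tags are balanced.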
