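python-markdown htmlStash placeholder is not being replaced
I'm developing a web app with the Django framework and using python-markdown to convert Markdown text to HTML. There are a few cases that Markdown can't handle, so I wrote a basic extension.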
"""
Helps make paras for Less framework
@div large-column float-left
# This is an H1
this is a paragraph right here!
and a new one
## Heading 2
and yet another one
--> becomes -->
<div class="large-column float-left">
<h1>This is an H1</h1>
<p>this is a paragraph right here!</p>
<p>and a new one</p>
<h2>Heading 2</h2>
<p>and yet another one</p>
</div>
"""
import re
import markdown

# Global vars
LESS_BLOCK_RE = re.compile(
    r'@(?P<tag>div|span)[ ]*(?P<class>[a-zA-z0-9-\ ^\n]+)[ ]*\n(?P<inner>.*)(?=div|span)?',
    re.MULTILINE|re.DOTALL
    )

class LessFrameworkExtension(markdown.Extension):
    def extendMarkdown(self, md, md_globals):
        md.registerExtension(self)
        md.preprocessors.add('less_framework', LessBlockPreprocessor(md),'_begin')

    def reset(self):
        print 'resetting'

class LessBlockPreprocessor(markdown.preprocessors.Preprocessor):
    def __init__(self, md):
        markdown.preprocessors.Preprocessor.__init__(self, md)

    def getConfig(self, key):
        if key in self.config:
            return self.config[key][0]
        else:
            return None

    def run(self, lines):
        """ Match and store Less Framework Blocks in the HTML Stash """
        text = "\n".join(lines)
        while 1:
            m = LESS_BLOCK_RE.search(text)
            if m:
                less_tag = m.group('tag')
                less_class = m.group('class')
                less_inner = m.group('inner')
                print less_tag
                print less_class
                print less_inner
                placeholder = self.markdown.htmlStash.store(less_inner, safe=True)
                text = '<%s class="%s">\n%s\n</%s>' % (less_tag, less_class, placeholder, less_tag)
            else:
                break
        return text.split("\n")

    def _escape(self, txt):
        """ basic html escaping """
        txt = txt.replace('&', '&amp;')
        txt = txt.replace('<', '&lt;')
        txt = txt.replace('>', '&gt;')
        txt = txt.replace('"', '&quot;')
        return txt

def makeExtension(configs):
    return LessFrameworkExtension(configs)
The extension above partly works, but the output is:
<div class="large-column float-left
">
wzxhzdk:0
</div>'
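That looks like the placeholder stored by htmlStash. Maybe I'm missing a call into python-markdown? Looking at similar extensions in the python-markdown project, my approach seems to follow the usual pattern.
Any help would be much appreciated!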
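For readers unfamiliar with the stash, the symptom can be reproduced with a small model. This is only a sketch and not python-markdown's own code: `TinyStash` and its `restore` method are made-up names, and the placeholder format (a `wzxhzdk:N` token wrapped in the control characters U+0002 and U+0003) is my understanding of what the library uses. It is written for Python 3.

```python
# Simplified model of a placeholder stash; NOT python-markdown's actual code.
STX = "\u0002"  # "start of text" control character (assumed delimiter)
ETX = "\u0003"  # "end of text" control character (assumed delimiter)
PLACEHOLDER = STX + "wzxhzdk:%d" + ETX


class TinyStash(object):
    """Hypothetical stand-in for md.htmlStash."""

    def __init__(self):
        self.blocks = []

    def store(self, html):
        """Set the raw HTML aside and return a placeholder for it."""
        self.blocks.append(html)
        return PLACEHOLDER % (len(self.blocks) - 1)

    def restore(self, text):
        """Swap every intact placeholder back for its stored HTML."""
        for i, html in enumerate(self.blocks):
            text = text.replace(PLACEHOLDER % i, html)
        return text


stash = TinyStash()
wrapped = "<div>\n%s\n</div>" % stash.store("<h1>This is an H1</h1>")

# Intact delimiters: the placeholder is found and replaced.
intact = stash.restore(wrapped)

# Delimiters lost along the way: nothing matches any more,
# so the bare "wzxhzdk:0" text leaks into the final output.
damaged = stash.restore(wrapped.replace(STX, "").replace(ETX, ""))
```

In this model `damaged` keeps the bare `wzxhzdk:0` token between the `<div>` tags, which is the same shape as the output shown above.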
Sample input and expected output
@div large-column float-left
# This is an H1
this is a paragraph right here!
and a new one
## Heading 2
and yet another one
Extended Markdown --> becomes --> HTML
<div class="large-column float-left">
<h1>This is an H1</h1>
<p>this is a paragraph right here!</p>
<p>and a new one</p>
<h2>Heading 2</h2>
<p>and yet another one</p>
</div>
1 Answer
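I know this is an old post, but in case anyone else (like me) runs into this problem and finds this thread: you need to register the preprocessor so that it runs after at least the normalize_whitespace step, because that step strips some Unicode characters that htmlStash uses as delimiters around its placeholders.
In this case,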
md.preprocessors.add('less_framework', LessBlockPreprocessor(md),'_begin')
should be:
md.preprocessors.add('less_framework', LessBlockPreprocessor(md),'>normalize_whitespace')
For more information, see: https://github.com/Python-Markdown/markdown/issues/222
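The effect of the ordering can be sketched without the library. The functions below are simplified stand-ins, not python-markdown's implementation: `normalize_whitespace` is assumed to do nothing except drop U+0002 and U+0003 from the source, `less_framework` is a toy version of the preprocessor from the question, and `run_pipeline` is a hypothetical helper. Written for Python 3.

```python
# Toy model of preprocessor ordering; the names mirror the thread,
# but none of this is python-markdown's real code.
STX, ETX = "\u0002", "\u0003"  # assumed placeholder delimiters


def normalize_whitespace(text):
    """Model of the cleanup step: drop stray STX/ETX from the source."""
    return text.replace(STX, "").replace(ETX, "")


def less_framework(text):
    """Model of the custom step: leave a delimited placeholder behind."""
    return text.replace("@div", STX + "wzxhzdk:0" + ETX)


def run_pipeline(steps, text):
    """Apply each step in order, feeding the result to the next one."""
    for step in steps:
        text = step(text)
    return text


# '_begin': the custom step runs first, then its delimiters are stripped.
at_begin = run_pipeline([less_framework, normalize_whitespace], "@div")

# '>normalize_whitespace': the custom step runs after the cleanup,
# so the delimiters survive for the later replacement pass.
after_normalize = run_pipeline([normalize_whitespace, less_framework], "@div")
```

With the first ordering only the bare token is left; with the second the placeholder keeps both delimiters, so a later pass can still recognise it.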