Parsing HTML in a Python wxWidgets TextCtrl
Is there a way, or a library, that lets me parse HTML code inside a wx.TextCtrl widget?
2 answers
0
A wxTextCtrl shows HTML content as plain text, with all of its tags visible:
<html><body>Hello, world!</body></html>
To render the HTML, you need to use wxHtmlWindow (wx.html.HtmlWindow in wxPython):
import wx.html
# self is the parent window (for example a wx.Frame or wx.Panel)
w = wx.html.HtmlWindow(self)
w.SetPage("<html><body>Hello, world!</body></html>")
1
Sure. Call myTextCtrl.GetValue() to get the text, then parse that string with a tool such as BeautifulSoup, xml.dom.minidom, or HTMLParser:
from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3; with BeautifulSoup 4 use: from bs4 import BeautifulSoup
# Let's say this is the text inside the TextCtrl:
# '<html><head><title>Page title</title></head><body><p id="firstpara" align="center">This is paragraph <b>one</b>.<p id="secondpara" align="blah">This is paragraph <b>two</b>.</html>'
#
htmlStr = myTextCtrl.GetValue()
soup = BeautifulSoup(htmlStr)
soup.contents[0].name
# u'html'
soup.contents[0].contents[0].name
# u'head'
head = soup.contents[0].contents[0]
head.parent.name
# u'html'
head.next
# <title>Page title</title>
head.nextSibling.name
# u'body'
head.nextSibling.contents[0]
# <p id="firstpara" align="center">This is paragraph <b>one</b>.</p>
head.nextSibling.contents[0].nextSibling
# <p id="secondpara" align="blah">This is paragraph <b>two</b>.</p>
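
If you would rather not install anything, the HTMLParser option mentioned above can be sketched roughly like this. It assumes Python 3, where the module is html.parser (in Python 2 it was called HTMLParser). TagCollector is a hypothetical helper name made up for this illustration, and the hard-coded string only stands in for whatever myTextCtrl.GetValue() would return:

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper: records start tags and the text inside <b> elements."""

    def __init__(self):
        super().__init__()
        self.tags = []        # every start tag, in document order
        self.bold_text = []   # text that appears inside <b>...</b>
        self._in_bold = False

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "b":
            self._in_bold = True

    def handle_endtag(self, tag):
        if tag == "b":
            self._in_bold = False

    def handle_data(self, data):
        if self._in_bold:
            self.bold_text.append(data)


# Stand-in for myTextCtrl.GetValue(); in a real application the string
# would come from the control instead of being hard-coded.
html_str = ('<html><head><title>Page title</title></head><body>'
            '<p id="firstpara" align="center">This is paragraph <b>one</b>.'
            '<p id="secondpara" align="blah">This is paragraph <b>two</b>.</html>')

collector = TagCollector()
collector.feed(html_str)
collector.close()

print(collector.tags)
print(collector.bold_text)
```

Unlike BeautifulSoup, this does not build a tree. It only reports tags and text as it encounters them, so it fits best when you just need to pull a few pieces out of the string.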