Suppose I have a nested HTML (unordered) list like this:
<ul style="">
<li class="jstree-last jstree-open" id="wfo-7000000004">
<ins class="jstree-icon"> </ins>
<a class="" href="taxon/wfo-7000000004">
<ins class="jstree-icon"> </ins>
Acoraceae
</a>
<ul style="">
<li class="jstree-last jstree-open" id="wfo-4000000350">
<ins class="jstree-icon"> </ins>
<a class="" href="taxon/wfo-4000000350">
<ins class="jstree-icon"> </ins>
Acorus
</a>
<ul style="">
<li class="jstree-open" id="wfo-0000350733">
<ins class="jstree-icon"> </ins>
<a class="" href="taxon/wfo-0000350733">
<ins class="jstree-icon"> </ins>
Acorus calamus
</a>
<ul style="">
<li class="jstree-leaf" id="wfo-0000350841">
<ins class="jstree-icon"> </ins>
<a class="" href="taxon/wfo-0000350841">
<ins class="jstree-icon"> </ins>
Acorus calamus var. americanus
</a>
</li>
<li class="jstree-last jstree-leaf" id="wfo-0000350949">
<ins class="jstree-icon"> </ins>
<a class="" href="taxon/wfo-0000350949">
<ins class="jstree-icon"> </ins>
Acorus calamus var. angustatus
</a>
</li>
</ul>
</li>
<li class="jstree-last jstree-leaf" id="wfo-0000352676">
<ins class="jstree-icon"> </ins>
<a class="" href="taxon/wfo-0000352676">
<ins class="jstree-icon"> </ins>
Acorus gramineus
</a>
</li>
</ul>
</li>
</ul>
</li>
</ul>
How can I build a nested dictionary from it in Python? For example:
{
  Acorales: {
    Acoraceae: {
      Acorus: {
        Acorus calamus: [
          Acorus calamus var. americanus,
          Acorus calamus var. angustatus
        ],
        Acorus gramineus
      }
    }
  }
}
I assume libraries such as Beautiful Soup or HTML Parser can do this (with for loops in Python), but I haven't figured it out yet. Thanks for your help!
Here is what I tried:
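def create_dic(soup):
    # Map each direct <li> name to the dict built from its own nested <ul>.
    # "soup" is a Beautiful Soup tag; calling it is shorthand for find_all().
    return {li.a.get_text().replace("\xa0", ""): create_dic(li)
            for ul in soup('ul', recursive=False)
            for li in ul('li', recursive=False)}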
However, the output looks like this (Acorus calamus var. americanus and Acorus calamus var. angustatus should be in a list, and Acorus gramineus should not be a dictionary):
{'Acorales': {'Acoraceae': {'Acorus': {'Acorus calamus': {'Acorus calamus var. americanus': {},
'Acorus calamus var. angustatus': {}},
'Acorus gramineus': {}}}}}
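The attempt above always returns a dict comprehension, so a leaf `<li>` with no nested `<ul>` ends up as an empty `{}`. One possible way to get closer to the desired shape is sketched below. It uses the standard-library `html.parser` instead of Beautiful Soup, only to avoid a third-party dependency. The names `ListTreeParser`, `to_nested`, and `html_list_to_nested` are made up for this illustration. Two assumptions are baked in: every `<li>` has an explicit `</li>` and carries its name inside an `<a>`, and a leaf that sits next to non-leaf siblings (such as Acorus gramineus) is mapped to `None`, because the desired output does not say what its value should be.

```python
from html.parser import HTMLParser


class ListTreeParser(HTMLParser):
    """Hypothetical helper: collect nested <li> items into a simple tree."""

    def __init__(self):
        super().__init__()
        self.root = {"name": "", "children": []}
        self.stack = [self.root]  # chain of currently open <li> nodes
        self.in_link = False      # True while inside an <a> element

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            node = {"name": "", "children": []}
            self.stack[-1]["children"].append(node)
            self.stack.append(node)
        elif tag == "a":
            self.in_link = True

    def handle_endtag(self, tag):
        if tag == "li" and len(self.stack) > 1:
            self.stack.pop()
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        # Text inside the <a> of the innermost open <li> is that item's name.
        if self.in_link and len(self.stack) > 1:
            self.stack[-1]["name"] += data


def to_nested(children):
    """All-leaf siblings become a list of names; otherwise a dict is built."""
    # split()/join collapses newlines and non-breaking spaces (\xa0) in names.
    names = [" ".join(child["name"].split()) for child in children]
    if all(not child["children"] for child in children):
        return names
    # Assumption: a leaf among non-leaf siblings gets None as its value.
    return {name: (to_nested(child["children"]) if child["children"] else None)
            for name, child in zip(names, children)}


def html_list_to_nested(html):
    parser = ListTreeParser()
    parser.feed(html)
    parser.close()
    return to_nested(parser.root["children"])


html = """
<ul><li><a href="taxon/wfo-4000000350">Acorus</a>
  <ul>
    <li><a href="taxon/wfo-0000350733">Acorus calamus</a>
      <ul>
        <li><a href="taxon/wfo-0000350841">Acorus calamus var. americanus</a></li>
        <li><a href="taxon/wfo-0000350949">Acorus calamus var. angustatus</a></li>
      </ul>
    </li>
    <li><a href="taxon/wfo-0000352676">Acorus gramineus</a></li>
  </ul>
</li></ul>
"""
print(html_list_to_nested(html))
# {'Acorus': {'Acorus calamus': ['Acorus calamus var. americanus',
#   'Acorus calamus var. angustatus'], 'Acorus gramineus': None}}
```

If a different value for mixed-level leaves is preferred (for example an empty list), only the `else None` branch in `to_nested` needs to change. The same list-versus-dict rule could also be applied inside the Beautiful Soup version by checking whether any `<li>` has a nested `<ul>` before choosing the return type.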