BeautifulSoup: searching by class

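2 answers

User

Answer 1 · edited 2024-05-13 17:58:43

If wikitable appears after another CSS class (as in class="something wikitable other"), a pattern that expects it to come first will also fail to match. So if you want every table whose class attribute contains the class wikitable, you need a pattern that accepts more possibilities: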

import re
from bs4 import BeautifulSoup

html = '''<html><table class="sortable wikitable other">blah</table>
<table class="wikitable sortable">blah</table>
<table class="wikitable"><blah></table></html>'''

tree = BeautifulSoup(html, "html.parser")
# \b marks a word boundary, so wikitable can sit anywhere in the class list.
for node in tree.find_all(attrs={'class': re.compile(r".*\bwikitable\b.*")}):
    print(node)

Result:

<table class="sortable wikitable other">blah</table>
<table class="wikitable sortable">blah</table>
<table class="wikitable"><blah></blah></table>
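One caveat with the word-boundary pattern, shown here as a minimal standard-library sketch: \b treats a hyphen as a boundary, so a class such as wikitable-dark also matches. The helper has_class and the sample class strings are made up for illustration; splitting the attribute on whitespace is one way to test for an exact class token.

```python
import re

# Same pattern as in the answer above.
pattern = re.compile(r".*\bwikitable\b.*")


def has_class(class_attr, name):
    """Hypothetical helper: True if `name` is one of the whitespace-separated tokens."""
    return name in class_attr.split()


# Illustrative class attribute values, not taken from a real page.
samples = [
    "sortable wikitable other",
    "wikitable sortable",
    "wikitable",
    "wikitable-dark",   # the regex matches this, the token test does not
    "mywikitable",      # neither matches
]

for value in samples:
    regex_hit = pattern.match(value) is not None
    token_hit = has_class(value, "wikitable")
    print(f"{value!r}: regex={regex_hit}, token={token_hit}")
```

If hyphenated class names never occur in your pages, the two checks agree and the regex is fine as it is.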

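For the record, I don't use BeautifulSoup myself; I prefer lxml, as others have mentioned.

User

Answer 2 · edited 2024-05-13 17:58:43

One thing that makes lxml better than BeautifulSoup is its support for proper CSS class selection (and even full CSS selectors, if you want to use them).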

import lxml.html

html = """<html>
<body>
<div class="bread butter"></div>
<div class="bread"></div>
</body>
</html>"""

tree = lxml.html.fromstring(html)

elements = tree.find_class("bread")

for element in elements:
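    # encoding="unicode" returns str rather than bytes; with_tail=False drops the trailing newline.
    print(lxml.html.tostring(element, encoding="unicode", with_tail=False))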

This gives:

<div class="bread butter"></div>
<div class="bread"></div>
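For comparison, here is a rough sketch of the same whole-token class test written as a plain XPath expression, which needs nothing beyond lxml itself. The extra breadcrumbs div is an assumption added to show that a substring of a class name does not match. The CSS-selector route would be tree.cssselect("div.bread"), which as far as I know needs the separate cssselect package, so it is not called here.

```python
import lxml.html

html = """<html>
<body>
<div class="bread butter"></div>
<div class="breadcrumbs"></div>
<div class="bread"></div>
</body>
</html>"""

tree = lxml.html.fromstring(html)

# Pad the class attribute with spaces so that only whole tokens match:
# " bread butter " contains " bread ", but " breadcrumbs " does not.
xpath = "//div[contains(concat(' ', normalize-space(@class), ' '), ' bread ')]"

by_xpath = tree.xpath(xpath)
by_find_class = tree.find_class("bread")

for element in by_xpath:
    print(element.get("class"))

# Both approaches should select the same two divs.
print([e.get("class") for e in by_find_class])
```

find_class is shorter when you only need one class; the XPath form is handy when the class test has to be combined with other conditions.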
