<body>
<!-- Branding and main navigation -->
<div class="Branding">The Science & Safety Behind Your Favorite Products</div>
<div class="l-branding">
<p>Just a brand</p>
</div>
<!-- test comment here -->
<div class="block_content">
<a href="https://www.google.com">Google</a>
</div>
</body>
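Code:
from bs4 import BeautifulSoup as BS
from bs4 import Comment
# .... (html is assumed to hold the markup shown above)
soup = BS(html, 'html.parser')
# A function passed as `string` is called on each text node; keep only Comment nodes
comments = soup.find_all(string=lambda text: isinstance(text, Comment))
for c in comments:
    print(c)
    print("===========")
    c.extract()  # remove the comment from the parse tree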
The result will be:
Branding and main navigation
===========
test comment here
===========
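You can pass a function to find_all() so that it checks whether each string is a comment. The HTML, the code, and the result for one such example are shown above.

By the way, I think the reason find_all('Comment') doesn't work is this (from the Beautiful Soup documentation):

Pass in a value for name and you’ll tell Beautiful Soup to only consider tags with certain names. Text strings will be ignored, as will tags whose names don’t match.

A comment is a text string rather than a tag, so a search by name never matches it.

I needed to do two things:
First, import Comment when importing Beautiful Soup (from bs4 import Comment).
Second, extract the comments with find_all() and a function that tests for Comment, as in the code above.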
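To make the difference concrete, here is a minimal, self-contained sketch that runs both searches side by side. The markup is a shortened, hypothetical version of the HTML above, and the variable names (by_name, by_type, comment_texts, remaining) are only illustrative; it assumes a bs4 version that accepts the string argument of find_all().

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical sample markup, shortened from the HTML at the top of this answer
html = """<body>
<!-- Branding and main navigation -->
<div class="l-branding"><p>Just a brand</p></div>
<!-- test comment here -->
</body>"""

soup = BeautifulSoup(html, "html.parser")

# A bare string is matched against tag names, so this looks for <Comment> tags
by_name = soup.find_all("Comment")

# A function passed as `string` is tested against text nodes instead
by_type = soup.find_all(string=lambda text: isinstance(text, Comment))

# Comment is a str subclass, so strip() trims the padding inside <!-- ... -->
comment_texts = [c.strip() for c in by_type]

# Remove every comment from the parse tree
for c in by_type:
    c.extract()

# Search again to confirm nothing is left
remaining = soup.find_all(string=lambda text: isinstance(text, Comment))

print(len(by_name), comment_texts, len(remaining))
```

The search by name finds nothing because the document contains no tag called Comment, while the function-based search returns both comment nodes, which can then be removed with extract().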