How to find all comments with Beautiful Soup

2 Answers

User

Answer 1 · edited 2024-05-23 17:24:20

You can pass a function to find_all() so that it checks whether each string is a comment.

For example, given the following HTML:

<body>
   <!-- Branding and main navigation -->
   <div class="Branding">The Science &amp; Safety Behind Your Favorite Products</div>
   <div class="l-branding">
      <p>Just a brand</p>
   </div>
   <!-- test comment here -->
   <div class="block_content">
      <a href="https://www.google.com">Google</a>
   </div>
</body>

Code:

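from bs4 import BeautifulSoup as BS
from bs4 import Comment

# .... (html holds the markup shown above)
soup = BS(html, 'html.parser')
# Comment is a subclass of NavigableString, so filter the text nodes by type
comments = soup.find_all(string=lambda text: isinstance(text, Comment))
for c in comments:
    print(c)
    print("===========")
    c.extract()  # detach the comment from the tree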

The result will be:

 Branding and main navigation 
===========
 test comment here 
===========

By the way, I think the reason find_all('Comment') does not work is this (from the BeautifulSoup documentation):

Pass in a value for name and you’ll tell Beautiful Soup to only consider tags with certain names. Text strings will be ignored, as will tags whose names that don’t match.
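
To make the difference concrete, here is a minimal sketch comparing the two calls. The one-line `html` string is an illustrative assumption trimmed from the example above, and the variable names `by_name` and `by_type` are hypothetical.

```python
from bs4 import BeautifulSoup, Comment

# Assumed sample markup, shortened from the HTML in the answer
html = (
    "<body><!-- test comment here -->"
    "<div class='block_content'><a href='https://www.google.com'>Google</a></div>"
    "</body>"
)
soup = BeautifulSoup(html, "html.parser")

# A positional string is treated as a tag name, so this searches for <Comment> tags
by_name = soup.find_all("Comment")

# A function passed as string= is applied to text nodes, which include comments
by_type = soup.find_all(string=lambda text: isinstance(text, Comment))

print(len(by_name))  # 0
print(len(by_type))  # 1
```

The name filter finds nothing because the document contains no tag called Comment, while the string filter sees the comment node and keeps it.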

User

Answer 2 · edited 2024-05-23 17:24:20

I needed to do two things:

First, when importing Beautiful Soup:

from bs4 import BeautifulSoup, Comment

Second, here is the code that extracts the comments:

# find_all and string= are the current names for the legacy findAll and text=
for comment in soup.find_all(string=lambda text: isinstance(text, Comment)):
    comment.extract()
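
As a quick check that extract() really removes the comments, the sketch below runs the same loop on a small assumed snippet and inspects the tree afterwards. The `html` string and the names `removed` and `remaining` are hypothetical. It uses find_all with string=, which is the current spelling of the older findAll and text= names.

```python
from bs4 import BeautifulSoup, Comment

# Assumed sample markup for illustration only
html = "<div><!-- test comment here --><p>Just a brand</p></div>"
soup = BeautifulSoup(html, "html.parser")

removed = []
for comment in soup.find_all(string=lambda text: isinstance(text, Comment)):
    # extract() detaches the node from the tree and returns it
    removed.append(comment.extract())

# Search again to confirm no comments are left in the tree
remaining = soup.find_all(string=lambda text: isinstance(text, Comment))

print(len(removed))    # 1
print(len(remaining))  # 0
print(str(soup))       # <div><p>Just a brand</p></div>
```

Because find_all returns a list built before the loop starts, removing nodes inside the loop does not disturb the iteration.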
