Getting the content of a tag with an empty id in BeautifulSoup

Posted 2024-04-24 06:33:06


from bs4 import BeautifulSoup

page = """<span id="something">useless</span>
          <span id="">some text</span>
          <span id="different">useless</span>"""
soup = BeautifulSoup(page, "html.parser")

How can I get only some text? Using soup.find_all('span', {'id': ""}) returns everything.


1 answer
Answer 1 · posted 2024-04-24 06:33:06

You have two options:

  1. Use a custom filter: pass in a function, which is called for each element and must return True or False:

    soup.find_all(lambda e: e.name == 'span' and e.attrs.get('id') == '')
    
  2. Use a CSS selector with an exact attribute match:

    soup.select('span[id=""]')
    

Demo:

>>> from bs4 import BeautifulSoup
>>> page = """<span id="something">useless</span>
...           <span id="">some text</span>
...           <span id="different">useless</span>"""
>>> soup = BeautifulSoup(page, "html.parser")
>>> soup.find_all(lambda e: e.name == 'span' and e.attrs.get('id') == '')
[<span id="">some text</span>]
>>> soup.select('span[id=""]')
[<span id="">some text</span>]
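Both calls return the matching tags, while the question asks for the text inside them. The script below is a minimal sketch of that last step, assuming the built-in "html.parser" backend. The helper name has_empty_id and the fourth span without any id are illustrative additions, not part of the original post; the extra span is there to show that a missing id is not treated as an empty one.

```python
from bs4 import BeautifulSoup

# The last <span> (no id attribute at all) is a hypothetical addition
# used to contrast "empty id" with "missing id".
page = """<span id="something">useless</span>
          <span id="">some text</span>
          <span id="different">useless</span>
          <span>no id at all</span>"""

# Name the parser explicitly so the result does not depend on which
# optional parsers happen to be installed.
soup = BeautifulSoup(page, "html.parser")


def has_empty_id(tag):
    """Return True only for <span> tags whose id is present and empty."""
    # attrs.get('id') is None when the attribute is missing, so a span
    # without an id does not compare equal to ''.
    return tag.name == "span" and tag.attrs.get("id") == ""


# Option 1: custom filter function passed to find_all().
texts_from_filter = [tag.get_text() for tag in soup.find_all(has_empty_id)]

# Option 2: CSS selector with an exact attribute match.
texts_from_select = [tag.get_text() for tag in soup.select('span[id=""]')]

print(texts_from_filter)
print(texts_from_select)
```

With this input, both lists should hold the single string some text. If only the first match is needed, soup.find(has_empty_id) or soup.select_one('span[id=""]') returns one tag (or None) instead of a list.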
