How do I get a div element nested inside another div element using BeautifulSoup?

Posted 2024-06-02 08:10:53


Here is the sample HTML:

   <div class="cb-col cb-col-25 cb-mtch-blk"><a class="cb-font-12" href="/live-cricket-scores/16947/ind-vs-ban-only-test-bangladesh-tour-of-india-2017" target="_self" title="India v Bangladesh - Only Test">
<div class="cb-hmscg-bat-txt cb-ovr-flo ">
<div class="cb-ovr-flo cb-hmscg-tm-nm">BAN</div>
<div class="cb-ovr-flo" style="display:inline-block; width:140px">322/6 (104.0 Ovs)</div>
</div>

I want to extract text such as BAN and 322/6 (104.0 Ovs) from the HTML above. This is what I have so far:

soup = BeautifulSoup(html)
div_class = soup.findAll('div',class_='cb-col cb-col-25 cb-mtch-blk')
for each in div_class:
    #I want to get those texts from variable 'each'

How can I do this?


2 Answers

You can use CSS selectors with BeautifulSoup 4:

>>> from bs4 import BeautifulSoup
>>> html = ...  # the html provided in the question
>>> soup = BeautifulSoup(html, 'lxml')
>>> name, size = soup.select('div.cb-hmscg-bat-txt.cb-ovr-flo div')
>>> name.text
u'BAN'
>>> size.text
u'322/6 (104.0 Ovs)'
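The same selector idea can be wrapped in a small function that handles any number of match blocks. This is only a sketch under a few assumptions: the helper name extract_scores is made up for illustration, the question's HTML has been closed off with </a></div> so it is a complete fragment, and the built-in html.parser is swapped in for lxml so that no extra package is needed.

```python
from bs4 import BeautifulSoup

# Sample markup from the question, with the closing </a></div> added
# (an assumption: the original snippet was cut off before those tags).
HTML = """
<div class="cb-col cb-col-25 cb-mtch-blk"><a class="cb-font-12" href="/live-cricket-scores/16947/ind-vs-ban-only-test-bangladesh-tour-of-india-2017" target="_self" title="India v Bangladesh - Only Test">
<div class="cb-hmscg-bat-txt cb-ovr-flo ">
<div class="cb-ovr-flo cb-hmscg-tm-nm">BAN</div>
<div class="cb-ovr-flo" style="display:inline-block; width:140px">322/6 (104.0 Ovs)</div>
</div>
</a></div>
"""


def extract_scores(html):
    """Return a list of (team, score) tuples, one per match block.

    extract_scores is a hypothetical helper, not part of bs4.
    """
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for block in soup.select("div.cb-mtch-blk"):
        # Direct child divs of the batting-team container inside this block.
        cells = block.select("div.cb-hmscg-bat-txt > div")
        if len(cells) >= 2:
            team = cells[0].get_text(strip=True)
            score = cells[1].get_text(strip=True)
            results.append((team, score))
    return results


print(extract_scores(HTML))
```

Selecting on the single class cb-mtch-blk rather than the full class string keeps the match working even if the other classes on the outer div change, and the length check skips blocks that do not contain both a team name and a score.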

each holds the HTML you provided, so step down to the first div tag nested inside it and collect all of its text with stripped_strings:

div_class = soup.findAll('div',class_='cb-col cb-col-25 cb-mtch-blk')
for each in div_class:
    name, size = each.div.stripped_strings
    print(name, size)

Output:

BAN 322/6 (104.0 Ovs)
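Unpacking stripped_strings into exactly two names raises a ValueError when a block holds more or fewer text fragments, and each.div is None when a block has no nested div at all. A more defensive sketch is shown here; the helper name block_texts and the second and third blocks in SAMPLE (including the IND entry) are invented purely for illustration.

```python
from bs4 import BeautifulSoup

# First block mirrors the question's structure; the other two are
# made-up edge cases (a missing score and an empty block).
SAMPLE = """
<div class="cb-col cb-col-25 cb-mtch-blk">
  <div class="cb-hmscg-bat-txt cb-ovr-flo ">
    <div class="cb-ovr-flo cb-hmscg-tm-nm">BAN</div>
    <div class="cb-ovr-flo">322/6 (104.0 Ovs)</div>
  </div>
</div>
<div class="cb-col cb-col-25 cb-mtch-blk">
  <div class="cb-hmscg-bat-txt cb-ovr-flo ">
    <div class="cb-ovr-flo cb-hmscg-tm-nm">IND</div>
  </div>
</div>
<div class="cb-col cb-col-25 cb-mtch-blk"></div>
"""


def block_texts(html):
    """Return one list of stripped text fragments per match block.

    block_texts is a hypothetical helper, not part of bs4.
    """
    soup = BeautifulSoup(html, "html.parser")
    texts = []
    for each in soup.find_all("div", class_="cb-mtch-blk"):
        inner = each.find("div")  # same element as each.div
        if inner is None:
            continue  # block without a nested div: nothing to read
        # Collect into a list instead of unpacking into fixed names.
        texts.append(list(inner.stripped_strings))
    return texts


for fragments in block_texts(SAMPLE):
    print(" ".join(fragments))
```

Collecting the fragments into a list leaves it to the caller to decide how to treat a block with a missing score, instead of failing in the middle of the loop.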
