Finding HTML tags by substring with BeautifulSoup in Python 3

Posted 2024-04-25 22:20:45


Here is my code:

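import re
import requests
from bs4 import BeautifulSoup
url = 'http://lampspw.wallonie.be/dgo4/site_ipic/index.php/fiche/index?sortCol=2&sortDir=asc&start=0&nbElemPage=10&filtre=&codeInt=62121-INV-0018-02'
page = requests.get(url)  # download the page; `page` was undefined in the original snippet
soup = BeautifulSoup(page.content, 'html.parser')
t = soup.find_all("div", attrs={'class':'panel-heading'})  # every panel heading div
lst = [x.text for x in t]  # visible text of each heading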

I get:

['\xa0Filtres complémentaires',
 '\xa0Recherche dans les notices',
 'Libellé(s)\xa0',
 'Illustration(s)',
 'Localisation',...]

If I search the soup directly for one specific tag (one that appears in that list) using a substring:

In [290]: soup.find_all("div", string=re.compile('Locali'))
Out[291]: [<div class="panel-heading">Localisation</div>]

I get the tag I wanted. But if I do this:

In :soup.find_all("div", string=re.compile('Libe'))
Out: []

Can anyone explain this? I guess the cause is somewhere in the HTML code, but I couldn't find it...


1 Answer

User

#1 · Posted 2024-04-25 22:20:45

Thanks to kcorlidy: soup.find_all(string=re.compile('Libe')) will return the result.
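
A likely reason for the difference, as a minimal sketch rather than a confirmed diagnosis: when a tag name is passed together with string=, BeautifulSoup compares the pattern against each tag's .string, and .string is None whenever a tag has more than one child. Without a tag name, the text nodes themselves are searched, so the match succeeds. The markup below is hypothetical (the real page's HTML is not shown in the question); it assumes the 'Libellé(s)' div holds an extra child tag next to its text.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page, whose HTML is not shown.
# The second div holds a text node plus an (assumed) empty <i> child tag.
html = (
    '<div class="panel-heading">Localisation</div>'
    '<div class="panel-heading">Libellé(s)\xa0<i class="icon"></i></div>'
)
soup = BeautifulSoup(html, "html.parser")
pattern = re.compile("Libe")

# With a tag name, string= is checked against each tag's .string,
# which is None when the tag has more than one child.
by_tag = soup.find_all("div", string=pattern)

# Without a tag name, the text nodes themselves are searched.
text_nodes = soup.find_all(string=pattern)

# Climb from each matching text node to its enclosing div.
divs = [node.find_parent("div") for node in text_nodes]

# Alternative: keep the tag search and filter on each div's full text.
divs_by_text = [
    d for d in soup.find_all("div", class_="panel-heading")
    if pattern.search(d.get_text())
]

print(by_tag)        # empty for this markup
print(divs)          # the enclosing div, recovered via the text node
print(divs_by_text)  # the same div, found by filtering on get_text()
```

If the tag itself is needed rather than the text node, either find_parent("div") or the get_text() filter gets back to it; which one fits better depends on how the real page nests its elements.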
