The HTML is more complex than what I'm used to. Below is the code that extracts the data I need to collect:
title = soup.find_all('tr',attrs={'class':'cM'})
first = title[0]
first
I can get the movie title with the following code:
#movie title
first.find(attrs={'class':'cI'}).text
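However, I'm having trouble collecting the data below (year, age rating, score, and percentage). I don't know which class or reference I need to call to get it: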
<td>2017</td><td>13+</td><td>7.9</td><td>92%</td>
Here is the HTML:
<tr class="cM c2" itemprop="itemListElement" itemscope="" itemtype="//schema.org/ListItem"><td class="cH"><a href="/movie/thor-ragnarok-2017"><div class="d9 cN"><picture class="eT"><source srcset="https://img.reelgood.com/content/movie/19dcfe68-dc06-43ea-9c44-42255e780898/poster-92.webp 92w,https://img.reelgood.com/content/movie/19dcfe68-dc06-43ea-9c44-42255e780898/poster-154.webp 154w,https://img.reelgood.com/content/movie/19dcfe68-dc06-43ea-9c44-42255e780898/poster-185.webp 185w,https://img.reelgood.com/content/movie/19dcfe68-dc06-43ea-9c44-42255e780898/poster-342.webp 342w" type="image/webp"/><source srcset="https://img.reelgood.com/content/movie/19dcfe68-dc06-43ea-9c44-42255e780898/poster-92.jpg 92w,https://img.reelgood.com/content/movie/19dcfe68-dc06-43ea-9c44-42255e780898/poster-154.jpg 154w,https://img.reelgood.com/content/movie/19dcfe68-dc06-43ea-9c44-42255e780898/poster-185.jpg 185w,https://img.reelgood.com/content/movie/19dcfe68-dc06-43ea-9c44-42255e780898/poster-342.jpg 342w" type="image/jpeg"/><img alt="Watch Thor: Ragnarok" class="eU" data-async-image="true" decoding="async" src="https://img.reelgood.com/content/movie/19dcfe68-dc06-43ea-9c44-42255e780898/poster-342.jpg"/></picture></div></a></td><td class="cI"><a href="/movie/thor-ragnarok-2017">Thor: Ragnarok</a><meta content="https://reelgood.com/movie/thor-ragnarok-2017" itemprop="url"><meta content="1" itemprop="position"/></meta></td><td class="cJ"></td><td>2017</td><td>13+</td><td>7.9</td><td>92%</td><td class="cT"><span class="cU"><div class="hp cV"><img alt="netflix" src="https://img.reelgood.com/source-logos/netflix.svg"/></div></span><span class="cX">+ <!-- -->Rent or Buy</span><span><span class="cW"></span></span></td><td class="c0"></td><td class="cO"><div class="cP"><div><span>Want To See</span><img alt="Want To See Icon" src="/assets/f4b0d8c.svg" title="Add movie to watchlist"/></div><div class="cR"><span>Seen</span><img alt="Check Mark Icon" src="/assets/963fd9c.svg" title="Mark movie as 
seen"/></div></div></td></tr>
You can use
findNext("td")
For example:
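A minimal sketch of that idea, assuming BeautifulSoup 4 and a trimmed-down copy of the row from the question (the poster and button cells are left out, and the variable names `anchor`, `year`, `rated`, `score`, and `percent` are illustrative). It anchors on the empty `cJ` cell, which is the last classed cell before the unlabeled ones, and steps forward one `td` at a time. `find_next` is the current spelling of `findNext` in BeautifulSoup 4.

```python
from bs4 import BeautifulSoup

# Trimmed-down copy of the row from the question (assumption: only the
# cells that matter here are kept).
html = """
<table><tr class="cM c2">
<td class="cI"><a href="/movie/thor-ragnarok-2017">Thor: Ragnarok</a></td>
<td class="cJ"></td><td>2017</td><td>13+</td><td>7.9</td><td>92%</td>
<td class="cT"></td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")
first = soup.find_all("tr", attrs={"class": "cM"})[0]

# Anchor on the last cell that has a class, then step cell by cell.
anchor = first.find("td", attrs={"class": "cJ"})
year = anchor.find_next("td")
rated = year.find_next("td")
score = rated.find_next("td")
percent = score.find_next("td")

print(year.text, rated.text, score.text, percent.text)
```

Each call starts from the cell found by the previous one, so the order of the cells in the row is what identifies each value.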
If you don't want to repeat the code:
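One way to avoid the repetition, sketched under the same assumptions (BeautifulSoup 4, a trimmed-down copy of the row from the question), is to move the stepping into a small loop. The helper name `next_cells` is hypothetical and not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Trimmed-down copy of the row from the question (assumption: only the
# cells that matter here are kept).
html = """
<table><tr class="cM c2">
<td class="cI"><a href="/movie/thor-ragnarok-2017">Thor: Ragnarok</a></td>
<td class="cJ"></td><td>2017</td><td>13+</td><td>7.9</td><td>92%</td>
<td class="cT"></td>
</tr></table>
"""


def next_cells(anchor, count):
    """Hypothetical helper: text of the next `count` <td> cells after `anchor`."""
    values = []
    cell = anchor
    for _ in range(count):
        cell = cell.find_next("td")  # move on to the following <td>
        values.append(cell.text)
    return values


soup = BeautifulSoup(html, "html.parser")
first = soup.find_all("tr", attrs={"class": "cM"})[0]
anchor = first.find("td", attrs={"class": "cJ"})

year, rated, score, percent = next_cells(anchor, 4)
print(year, rated, score, percent)
```

When the cells are direct siblings, as they are in this row, `anchor.find_next_siblings("td", limit=4)` should return the same four cells in a single call.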
Output (the four cell values from the row above): 2017, 13+, 7.9, 92%