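How can I get only the h3 and h4 that belong to an h5 whose string is "Prem League", along with the div below it?
I want the text of the h3 and h4. Inside the div I need a span, and the text inside that span.
So when the h5 string is "Prem League", I want the h4 and h3 directly above it, and I need the various elements of the fixres__item directly below that h5.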
<div class="fixres__body" data-url="" data-view="fixture-update" data-controller="fixture-update" data-fn="live-refresh" data-sport="football" data-lite="true" id="widgetLite-6">
<h3 class="fixres__header1">November 2018</h3>
<h4 class="fixres__header2">Saturday 24th November</h4>
<h5 class="fixres__header3">Prem League</h5>
<div class="fixres__item">stuff in here</div>
<h4 class="fixres__header2">Wednesday 28th November</h4>
<h5 class="fixres__header3">UEFA Champ League</h5>
<div class="fixres__item">stuff in here</div>
<h3 class="fixres__header1">December 2018</h3>
<h4 class="fixres__header2">Sunday 2nd December</h4>
<h5 class="fixres__header3">Prem League</h5>
<div class="fixres__item">stuff in here</div>
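This is my code so far, but it also includes data from below the h5 with the string "UEFA Champ League", which I don't want. I only want data from the divs below the h5 header "Prem League". For example, I don't want PSG in the output, because it comes from the div below the h5 header "UEFA Champ League".
My code: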
import requests
from bs4 import BeautifulSoup


def squad_fixtures():
    team_table = ['https://someurl.com/liverpool-fixtures']
    team_home_namesall = []  # collects the home-team names across all fixtures
    for url in team_table:
        # team_fixture_urls = [i.replace('-squad', '-fixtures') for i in team_table]
        squad_r = requests.get(url)
        premier_squad_soup = BeautifulSoup(squad_r.text, 'html.parser')
        # print(premier_squad_soup)
        premier_fix_body = premier_squad_soup.find('div', {'class': 'fixres__body'})
        # print(premier_fix_body)
        # This selects every fixture div, whichever h5 competition header precedes it
        premier_fix_divs = premier_fix_body.find_all('div', {'class': 'fixres__item'})
        for item in premier_fix_divs:
            team_home = item.find_all('span', {'class': 'matches__item-col matches__participant matches__participant--side1'})
            for side in team_home:
                team_home_names = side.find('span', {'class': 'swap-text--bp30'})['title']
                team_home_namesall.append(team_home_names)
    print(team_home_namesall)
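Output:
['Watford', 'Paris Saint-Germain', 'Liverpool', 'Burnley', "B'mouth", 'Liverpool', 'Liverpool', 'Wolves', 'Liverpool', 'Liverpool', 'Man City', 'Brighton', 'Liverpool', 'Liverpool', 'West Ham', 'Liverpool', 'Man Utd', 'Liverpool', 'Everton', 'Liverpool', 'Fulham', 'Liverpool', 'Southampton', 'Liverpool', 'Cardiff', 'Liverpool', 'Newcastle', 'Liverpool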
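It seems your challenge is restricting the scrape to the "Prem League" <h5> elements and their associated content. The HTML is very flat: the h3, h4, h5 and div elements are all siblings inside fixres__body, with no nesting to group them. The best approach therefore looks like starting from each h5 and walking its previous and next siblings. The h5 itself is easy to locate by its class and text.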
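The sibling traversal described above could be sketched roughly as follows. This is a minimal illustration rather than a definitive solution: the function name `league_fixtures`, the `HTML` sample and the placeholder item texts ("match A" and so on) are assumptions made for the example, and it assumes the page keeps the flat layout shown in the question.

```python
from bs4 import BeautifulSoup

# Trimmed-down sample mirroring the flat structure from the question
# (the item contents are placeholders, not real page data).
HTML = """
<div class="fixres__body">
<h3 class="fixres__header1">November 2018</h3>
<h4 class="fixres__header2">Saturday 24th November</h4>
<h5 class="fixres__header3">Prem League</h5>
<div class="fixres__item">match A</div>
<h4 class="fixres__header2">Wednesday 28th November</h4>
<h5 class="fixres__header3">UEFA Champ League</h5>
<div class="fixres__item">match B</div>
<h3 class="fixres__header1">December 2018</h3>
<h4 class="fixres__header2">Sunday 2nd December</h4>
<h5 class="fixres__header3">Prem League</h5>
<div class="fixres__item">match C</div>
</div>
"""


def league_fixtures(html, league="Prem League"):
    """Collect month (h3), date (h4) and item divs for each matching h5."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for h5 in soup.find_all("h5", class_="fixres__header3"):
        if h5.get_text(strip=True) != league:
            continue
        # Nearest preceding h4/h3 siblings give the date and the month.
        h4 = h5.find_previous_sibling("h4", class_="fixres__header2")
        h3 = h5.find_previous_sibling("h3", class_="fixres__header1")
        items = []
        # Walk forward over tag siblings; stop at the first non-item tag,
        # i.e. the next header, so other competitions are never included.
        for sib in h5.find_next_siblings(True):
            if sib.name != "div" or "fixres__item" not in sib.get("class", []):
                break
            items.append(sib)
        results.append({
            "month": h3.get_text(strip=True) if h3 else None,
            "date": h4.get_text(strip=True) if h4 else None,
            "items": [item.get_text(strip=True) for item in items],
        })
    return results


if __name__ == "__main__":
    for fixture in league_fixtures(HTML):
        print(fixture)
```

On the real page, the span lookup from the question (`matches__participant--side1`, then `swap-text--bp30`) would be applied to each collected item div instead of taking its plain text.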