How can I scrape the other tabs?

Posted 2024-05-16 22:22:53

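I am trying to scrape the Sportium odds page with bs4.

The problem is that bs4 only captures the football odds tab, and I want the odds for all sports. I hope someone can help me solve this.

Here is my code: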

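import urllib.request
from urllib.error import URLError
from bs4 import BeautifulSoup

url = "https://sports.sportium.es/es/apuestasparahoy"
try:
    page = urllib.request.urlopen(url)
except URLError:
    # Stop here: without a response there is nothing to parse.
    raise SystemExit("An error occurred.")
soup = BeautifulSoup(page, "html.parser")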

1 answer
User
#1 · Posted 2024-05-16 22:22:53

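The page loads its data dynamically via JavaScript/Ajax, so the HTML returned by the initial request contains only the football tab. If you open the Firefox/Chrome developer tools, you can see where and how the page makes its requests.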
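To make "loaded dynamically" concrete: the initial HTML usually carries only a placeholder element whose attributes tell the page's script what to fetch. The sketch below runs the standard-library `html.parser` over a made-up snippet. The class and attribute names mirror the selector used in the answer's example, but the markup itself and the value `example-descriptor` are assumptions for illustration, not content copied from the real page.

```python
from html.parser import HTMLParser

# Hypothetical markup: a stand-in for the placeholder the real page serves.
SAMPLE_HTML = """
<div class="fragment inplay expander"
     data-src_code="UPCOMING"
     data-frag_desc="example-descriptor">
</div>
"""


class FragmentFinder(HTMLParser):
    """Collect data-frag_desc values from tags with data-src_code="UPCOMING"."""

    def __init__(self):
        super().__init__()
        self.descriptors = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if (attr_map.get("data-src_code") == "UPCOMING"
                and "data-frag_desc" in attr_map):
            self.descriptors.append(attr_map["data-frag_desc"])


finder = FragmentFinder()
finder.feed(SAMPLE_HTML)
print(finder.descriptors)
```

The placeholder holds a descriptor rather than odds, which is why parsing the first response alone cannot reach the other sports.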

This example prints the data from every tab:

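import requests
from bs4 import BeautifulSoup

main_url = 'https://sports.sportium.es/es/apuestasparahoy'
url = 'https://sports.sportium.es/web_nr'

# The static page only holds a placeholder; its data-frag_desc attribute
# identifies the fragment that the page requests via Ajax.
main_soup = BeautifulSoup(requests.get(main_url).text, 'html.parser')
frag = main_soup.select_one(
    '.fragment.inplay.expander[data-frag_desc][data-src_code="UPCOMING"]'
)['data-frag_desc']

# Form fields for the POST request that returns the fragment.
data = {
    "key": "CMS.web.cms_handlers.update_fragments",
    "frag": frag,
    "play_mode": "F",
}

# The response is JSON; its first element is parsed as HTML.
soup = BeautifulSoup(requests.post(url, data=data).json()[0], 'html.parser')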

# Tab id suffix -> readable sport name.
sports = {
    'FOOT': 'Football',
    'BASK': 'Basketball',
    'BASE': 'Baseball',
    'CRIC': 'Cricket',
    'DART': 'Darts',
    'ESPS': 'E-Sports',
    'AMFO': 'American Football',
    'ICEH': 'Ice Hockey',
    'VOLL': 'Volleyball',
}

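for div in soup.select('div[id^="upcoming-tab"]'):
    code = div['id'].replace('upcoming-tab-', '')
    # Fall back to the raw code so an unlisted sport does not raise KeyError.
    print('Sport :', sports.get(code, code))

    for tr in div.select('tr'):
        # Join the cell texts with '|', then drop the first field.
        print(tr.get_text(separator='|', strip=True).split('|')[1:])

    print('-' * 80)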

Prints:

Sport : Football
['23:00', '22 Dic', 'Humble Lions', '6/5', '2.20', '+120', 'X', '9/5', '2.80', '+180', 'Harbour View', '9/4', '3.25', '+225', '+34', 'st']
['00:30', '23 Dic', 'Blooming Santa Cruz', '4/11', '1.36', '-275', 'X', '7/2', '4.60', '+350', 'Guabira Montero', '11/2', '6.50', '+550', '+40', 'st']
['01:00', '23 Dic', 'Mount Pleasant FA', '4/6', '1.66', '-150', 'X', '9/4', '3.25', '+225', 'Cavalier', '7/2', '4.50', '+350', '+35', 'st']
['02:00', '23 Dic', 'Arnett Gardens', '5/6', '1.83', '-120', 'X', '21/10', '3.10', '+210', 'Dunbeholden FC', '3/1', '4.00', '+300', '+35', 'st']
['17:00', '23 Dic', 'PAOK', '2/9', '1.22', '-450', 'X', '17/4', '5.25', '+425', 'Atromitos Athinon', '11/1', '12.00', '+1100', '+38', 'st']
['17:00', '23 Dic', 'Giresunspor', '10/11', '1.90', '-110', 'X', '21/10', '3.10', '+210', 'Altinordu', '13/5', '3.60', '+260', '+33', 'st']
['18:00', '23 Dic', 'Atiker Konyaspor 1922', '17/10', '2.70', '+170', 'X', '2/1', '3.00', '+200', 'Trabzonspor', '7/5', '2.40', '+140', '+151', 'st']
['18:00', '23 Dic', 'Denizlispor', '23/10', '3.30', '+230', 'X', '21/10', '3.10', '+210', 'Alanyaspor', '21/20', '2.05', '+105', '+130', 'st']
['20:45', '23 Dic', 'Blackburn', '4/5', '1.80', '-125', 'X', '5/2', '3.50', '+250', 'Wigan', '10/3', '4.40', '+333', '+152', 'st']
                                        
Sport : Basketball
['23:00', '22 Dic', 'TCU', '13/20', '1.65', '-154', 'Xavier', '21/20', '2.05', '+105', '+3']
['23:00', '22 Dic', 'San Jose State', '13/10', '2.30', '+130', 'Cal Riverside', '8/15', '1.53', '-188', '+3']
['23:00', '22 Dic', 'Boise State', '8/11', '1.72', '-138', 'Georgia Tech', '19/20', '1.95', '-106', '+3']
['23:00', '22 Dic', 'Fuerza Regia de Monterrey', '1/10', '1.10', '-1000', 'Correcaminos UAT Victoria', '5/1', '6.00', '+500', '+7', 'st']

... and so on.
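The printed rows are flat lists, so a small post-processing step can make them easier to use. The sketch below is one possible way to do that, inferred only from the sample output above: it assumes each outcome takes four consecutive fields (name, fractional odds, decimal odds, American odds) and treats whatever is left at the end of the row as extras. The helper names `parse_row` and `fractional_to_decimal` are hypothetical, and the layout may differ for sports not shown here.

```python
from fractions import Fraction


def fractional_to_decimal(frac: str) -> float:
    """Convert fractional odds such as '6/5' to decimal odds (profit ratio + 1)."""
    return float(Fraction(frac) + 1)


def parse_row(fields):
    """Split one printed row into time, date, outcomes and trailing extras."""
    time, date, rest = fields[0], fields[1], fields[2:]
    outcomes = []
    # Assumed layout: name, fractional, decimal, American, repeated per outcome.
    while len(rest) >= 4 and "/" in rest[1]:
        name, frac, dec, american = rest[:4]
        outcomes.append({
            "name": name,
            "fractional": frac,
            "decimal": float(dec),
            "american": american,
        })
        rest = rest[4:]
    return {"time": time, "date": date, "outcomes": outcomes, "extras": rest}


# First football row from the output above.
row = ['23:00', '22 Dic', 'Humble Lions', '6/5', '2.20', '+120',
       'X', '9/5', '2.80', '+180',
       'Harbour View', '9/4', '3.25', '+225', '+34', 'st']
parsed = parse_row(row)
for outcome in parsed["outcomes"]:
    print(outcome["name"], outcome["decimal"])
```

For this row the fractional and decimal columns agree: `6/5` converts to 2.2. Some other rows show the decimal value cut to two digits (for example `4/6` appears as `1.66`), so compare with a tolerance rather than for exact equality.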
