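I'm trying to work out how to scrape a JSON response with Scrapy in Python. I can already fetch JSON successfully from a different page on the same site. Any help would be appreciated.
How do I get the values inside "tournamentGroup" (i.e. id, name), as well as year, title, and so on?
Partial code: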
import json
import urllib.request

start_url = 'https://api.wtatennis.com/tennis/tournaments/?page=0&pageSize=100&excludeLevels=ITF&from=2020-09-01&to=2020-09-30'

# cur and conn are the database cursor and connection, created elsewhere
with urllib.request.urlopen(start_url) as response:
    json_obj = response.read()

rank_list = json.loads(json_obj)
for item in rank_list:
    # attempt to read the id and year of each tournament
    tourney_id = item['content']['id']
    tourney_year = item['year']
    rank_data = [tourney_id, tourney_year]
    cur.execute("""insert into wta_rankings(tourney_id, tourney_year)
                   values(%s, %s)
                   ON CONFLICT DO NOTHING""",
                rank_data)
conn.commit()
cur.close()
JSON:
{
  "pageInfo":{
    "page":0,
    "numPages":0,
    "pageSize":100,
    "numEntries":10
  },
  "content":[
    {
      "tournamentGroup":{
        "id":2023,
        "name":"Prague 125K",
        "level":"125K",
        "metadata":null
      },
      "year":2020,
      "title":"Prague Open",
      "startDate":"2020-08-29",
      "endDate":"2020-09-06",
      "surface":"Clay",
      "inOutdoor":"O",
      "city":"PRAGUE",
      "country":"Czech Republic",
      "singlesDrawSize":128,
      "doublesDrawSize":32,
      "prizeMoney":3125000,
      "prizeMoneyCurrency":"USD",
      "liveScoringId":"2023"
    },
Try this:
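A minimal sketch of the idea, assuming the response has the shape shown in the question: a top-level object whose "content" key holds a list of tournaments, each with a nested "tournamentGroup" object. The function names `fetch_json` and `extract_tournaments` are hypothetical, chosen only for illustration.

```python
import json
import urllib.request


def fetch_json(url):
    """Download a URL and parse the body as JSON (performs a network request)."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read())


def extract_tournaments(payload):
    """Return one (group_id, group_name, year, title) tuple per tournament.

    The tournaments sit in the list under the top-level "content" key,
    so the loop runs over payload["content"], not over the payload itself.
    """
    rows = []
    for item in payload["content"]:
        group = item["tournamentGroup"]  # nested object holding id and name
        rows.append((group["id"], group["name"], item["year"], item["title"]))
    return rows
```

Against the live API this would be called as `extract_tournaments(fetch_json(start_url))`. Inside a Scrapy callback the same function could be given `json.loads(response.text)` instead.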
This will give you the following (it is only an example; you can pull out any field you want):
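To make the result concrete, here is a self-contained sketch that reads the fields from the one record visible in the question's JSON. The record is embedded as a string and trimmed to the fields used, so nothing is fetched from the live API; the variable names are illustrative.

```python
import json

# The first record from the question's JSON, trimmed to the fields used here.
raw = """
{
  "content": [
    {
      "tournamentGroup": {"id": 2023, "name": "Prague 125K", "level": "125K", "metadata": null},
      "year": 2020,
      "title": "Prague Open"
    }
  ]
}
"""

data = json.loads(raw)

lines = []
for item in data["content"]:
    group = item["tournamentGroup"]  # id and name are one level deeper
    lines.append(f'{group["id"]} | {group["name"]} | {item["year"]} | {item["title"]}')

print("\n".join(lines))
# prints: 2023 | Prague 125K | 2020 | Prague Open
```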
If you have trouble "navigating" the JSON, just copy the response content into an online JSON formatter, click the wrench icon to repair it, and then choose Format / Beautify.
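If you would rather not paste the response into a website, Python's standard `json` module can pretty-print it locally. A minimal sketch, using a compact payload trimmed from the question's JSON in place of the real response body:

```python
import json

# Compact payload standing in for the raw response body.
raw = '{"content":[{"tournamentGroup":{"id":2023,"name":"Prague 125K"},"year":2020}]}'

# indent=2 puts each nested level on its own indented line,
# which makes the path to a field (content -> tournamentGroup -> id) easy to see.
pretty = json.dumps(json.loads(raw), indent=2)
print(pretty)
```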