Code breaks after retrieving a JSON query result

Posted 2024-04-27 03:23:06

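I have tried to solve this, but after an error in Python I cannot (by myself) get to the next step.

I am querying this site: https://w.wiki/msg and I adjust the query by changing the city on each loop iteration; the cities are in [listElements]. When I reach a city like "Awaradam", the code breaks. (You can basically substitute a hard-coded value for the listElement.)

Putting a sleep timer in the loop does not fix it (I thought I might be sending requests too often).

The error is as follows: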

Traceback (most recent call last):
  File "C:/Users/xxx/PycharmProjects/pythonProject3/xxx.py", line 30, in <module>
    data = r.json()
  File "C:\ProgramData\Anaconda3\envs\pythonProject3\lib\site-packages\requests\models.py", line 898, in json
    return complexjson.loads(self.text, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\pythonProject3\lib\json\__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "C:\ProgramData\Anaconda3\envs\pythonProject3\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\ProgramData\Anaconda3\envs\pythonProject3\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The code (I edited it so that it can be reproduced; as it stands the code makes no sense, it simply breaks after looping for a while):

import requests
listPops = [[], []]
url = 'https://query.wikidata.org/sparql'
zaehler = -1
for i in range(100):
    zaehler = zaehler + 1
    #print(str(listElements[1][i]))
    #query = r"SELECT ?population WHERE { SERVICE wikibase:mwapi {bd:serviceParam mwapi:search '" + str(listElements[1][i]) + "' . bd:serviceParam mwapi:language 'en' . bd:serviceParam wikibase:api 'EntitySearch' . bd:serviceParam wikibase:endpoint 'www.wikidata.org' . bd:serviceParam wikibase:limit 1 . ?item wikibase:apiOutputItem mwapi:item .} ?item wdt:P1082 ?population} "
    query = """ SELECT ?population WHERE { SERVICE wikibase:mwapi {
          bd:serviceParam mwapi:search '""" + "Awaradam" + """'.    
          bd:serviceParam mwapi:language "en" . 
          bd:serviceParam wikibase:api "EntitySearch" .
          bd:serviceParam wikibase:endpoint "www.wikidata.org" .
          bd:serviceParam wikibase:limit 1 .
          ?item wikibase:apiOutputItem mwapi:item .
      }
      ?item wdt:P1082 ?population
    }
    """
    r = requests.get(url, params={'format': 'json', 'query': query}, timeout=10)
    #time.sleep(5)
    data = r.json()
    try:
        #population = r['results']['bindings'][0]['population']['value']
        if data['results']['bindings'][0]['population']['value']:
            population = data['results']['bindings'][0]['population']['value']
            print(str(zaehler) + ": " + "Population in " + str(listElements[1][i]) + ": " + f"{int(population):,}")
            listPops[0].append(str(listElements[1][i]))
            listPops[1].append(population)
    except:
        continue

print('Finished scrape.')

2 answers

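The traceback means that the result which came back is not JSON. If the remote server does not want to send JSON, you cannot make it do so, but you can skip the item when that happens (or try a different query, if you can find one that works).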

try:
    data = r.json()
except json.decoder.JSONDecodeError as err:
    logging.warning('Not JSON: %s (result %r)', err, r.text)
    continue

You will have to import logging (or just print the warning) and import json if you have not done so already.
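The same skip-on-failure idea can be tried without a network call by decoding plain text. This is only a sketch: decode_json_or_none is a hypothetical helper name, not part of requests. It catches ValueError, which json.JSONDecodeError subclasses, so it should also cover setups where the JSON error raised is a different ValueError subclass.

```python
import json
import logging


def decode_json_or_none(text):
    """Return the decoded JSON value, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError as err:  # json.JSONDecodeError is a ValueError subclass
        # Log a short prefix of the body so HTML error pages stay readable.
        logging.warning('Not JSON: %s (result %r)', err, text[:200])
        return None


# An HTML error page stands in for what the server sent instead of JSON.
print(decode_json_or_none('<html>Too Many Requests</html>'))
print(decode_json_or_none('{"results": {"bindings": []}}'))
```

In the loop, a None result would be the signal to continue with the next city.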

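Your blanket try/except would also work (just move the try up above the failing line), but it is really bad form. See Why is "except: pass" a bad programming practice?. In effect, it masks the fact that Wikidata has no result for Awaradam, and that you are running a futile loop trying to fetch it again and again.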

Here is a quick and dirty fix:

import requests
import time
import json

listPops = [[], []]
listElements = [[], ['Bangalore', 'Hyderabad', 'Awaradam', 'Rawalpindi']]
url = 'https://query.wikidata.org/sparql'

for i, city in enumerate(listElements[1]):
    query = """ SELECT ?population WHERE { SERVICE wikibase:mwapi {
          bd:serviceParam mwapi:search '""" + city + """'.    
          bd:serviceParam mwapi:language "en" . 
          bd:serviceParam wikibase:api "EntitySearch" .
          bd:serviceParam wikibase:endpoint "www.wikidata.org" .
          bd:serviceParam wikibase:limit 1 .
          ?item wikibase:apiOutputItem mwapi:item .
      }
      ?item wdt:P1082 ?population
    }
    """
    r = requests.get(url, params={'format': 'json', 'query': query}, timeout=10)
    time.sleep(5)
    try:
        data = r.json()
    except json.decoder.JSONDecodeError as err:
        print('Not JSON: %s (result %r)' % (err, r.text))
        # Skip this city; otherwise data is undefined or left over from the previous one.
        continue
    assert 'results' in data
    assert 'bindings' in data['results']
    if not data['results']['bindings']:
        #logging.warning('No results for %s', city)
        print('No results for', city)
        continue
    assert data['results']['bindings'], 'type %s %r' % (type(data['results']['bindings']), data['results']['bindings'])
    assert 'population' in data['results']['bindings'][0]
    assert 'value' in data['results']['bindings'][0]['population']
    if data['results']['bindings'][0]['population']['value']:
        population = data['results']['bindings'][0]['population']['value']
        print(f"{i}: Population in {city}: {int(population):,}")
        listPops[0].append(str(listElements[1][i]))
        listPops[1].append(population)
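The chain of asserts above could also be folded into one small function that returns None when there is nothing to report. The following is a sketch under assumptions: extract_population is a hypothetical name, the two dictionaries only imitate the shape of the response used in the code above, and the population figure is made up for illustration.

```python
def extract_population(data):
    """Return the first population value as an int, or None if absent."""
    bindings = data.get('results', {}).get('bindings', [])
    if not bindings:
        # The search matched nothing, or the matched item has no P1082 value.
        return None
    value = bindings[0].get('population', {}).get('value')
    return int(value) if value else None


# Hand-written stand-ins for decoded responses (not real query results).
empty = {'results': {'bindings': []}}
found = {'results': {'bindings': [{'population': {'value': '8443675'}}]}}

print(extract_population(empty))
print(extract_population(found))
```

Unlike a bare except, this keeps "no result for this city" distinguishable from a real failure, because only the missing-data case maps to None.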

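As @tripleee mentioned, the problem is that your query does not return valid JSON (it returns an HTML message instead). The server should tell you the status of the query. To handle that, check the status of the request: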

r = requests.get(url, params={'format': 'json', 'query': query}, timeout=10)
if r.status_code != 200:
    handle_your_error(r)  # placeholder: log the status, wait, or skip this city

For example, after running your example I got HTTP error 429: Too Many Requests.
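One way to react to a 429 is to wait and try again a limited number of times. The sketch below is not from the answer: fetch_with_retry and FakeResponse are hypothetical names, and it assumes that a Retry-After header, when present, holds a whole number of seconds. It takes a callable instead of calling requests directly, so the demo runs without a network.

```python
import time


def fetch_with_retry(fetch, max_attempts=4, default_wait=5.0, sleep=time.sleep):
    """Call fetch() until it stops returning HTTP 429 or attempts run out.

    fetch is any zero-argument callable returning an object with
    .status_code and .headers, for example a lambda wrapping requests.get.
    """
    response = fetch()
    for _ in range(max_attempts - 1):
        if response.status_code != 429:
            break
        # Assumption: Retry-After, if sent, is a plain number of seconds.
        retry_after = response.headers.get('Retry-After', '')
        wait = float(retry_after) if retry_after.isdigit() else default_wait
        sleep(wait)
        response = fetch()
    return response


class FakeResponse:
    """Stand-in for a response object, used only for this demo."""

    def __init__(self, status_code, headers=None):
        self.status_code = status_code
        self.headers = headers or {}


# Two rate-limited replies followed by a success.
replies = iter([
    FakeResponse(429, {'Retry-After': '2'}),
    FakeResponse(429),
    FakeResponse(200),
])
waits = []  # record the waits instead of actually sleeping
result = fetch_with_retry(lambda: next(replies), sleep=waits.append)
print(result.status_code, waits)
```

With the real service, the callable would be something like lambda: requests.get(url, params={'format': 'json', 'query': query}, timeout=10), and the returned response should still be checked for a non-200 status before calling r.json().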
