I can't open a website that exists

Posted 2024-04-20 07:50:27


I'm getting an error that leads me to believe my program can't find a website that I know exists. The website is:

https://www.transfermarkt.com/marco-reus/verletzungen/spieler/35207

My code looks like this:

from urllib import request as u_r

def strip_webite():
    with u_r.urlopen("https://www.transfermarkt.com/marco-reus/verletzungen/spieler/35207") as f:
        pass

if __name__ == "__main__":
  strip_webite()

The error I get is:

  File "database_management.py", line 19, in <module>
    strip_webite()
  File "database_management.py", line 15, in strip_webite
    with u_r.urlopen("https://www.transfermarkt.com/marco-reus/verletzungen/spieler/35207") as f:
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 532, in open
    response = meth(req, response)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 570, in error
    return self._call_chain(*args)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

1 answer
User
#1 · Posted 2024-04-20 07:50:27

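It looks like Transfermarkt blocks requests from bots that carry the default User-Agent string sent by Python's urllib library, even though its robots.txt file says nothing about this.

That seems to imply they don't mind being scraped, but they would prefer that we announce who we are.

To do that with urllib, set the User-Agent header on the request yourself: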

from urllib import request as u_r

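def strip_webite():
    # Build the request explicitly so a custom header can be attached to it.
    request = u_r.Request("https://www.transfermarkt.com/marco-reus/verletzungen/spieler/35207")
    # Replace urllib's default User-Agent with one that identifies the app.
    request.add_header('User-Agent', 'my-cool-app')
    with u_r.urlopen(request) as f:
        pass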

if __name__ == "__main__":
  strip_webite()
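
By default, urllib identifies itself with a User-Agent of the form Python-urllib/3.6, which is probably what the site is rejecting. As a rough sketch of the same fix, the header can also be passed through the `headers=` argument of `Request`, and the `HTTPError` can be caught so the status code is reported instead of a traceback. The helper names `build_request` and `fetch_status`, the `timeout` value, and the choice to return `None` on connection problems are assumptions made for illustration; they are not part of the original answer.

```python
from urllib import request as u_r
from urllib.error import HTTPError, URLError

URL = "https://www.transfermarkt.com/marco-reus/verletzungen/spieler/35207"


def build_request(url, user_agent="my-cool-app"):
    """Return a Request that carries an explicit User-Agent header."""
    # Passing headers= to the constructor has the same effect as
    # calling add_header() on the Request afterwards.
    return u_r.Request(url, headers={"User-Agent": user_agent})


def fetch_status(url, user_agent="my-cool-app", timeout=10):
    """Return the HTTP status code for url, or None if no response arrived.

    Hypothetical helper: the return convention is an assumption.
    """
    req = build_request(url, user_agent)
    try:
        with u_r.urlopen(req, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except URLError:
        # No HTTP response at all: DNS failure, refused connection, and similar.
        return None
```

Calling `fetch_status(URL)` performs the actual network request. Whether the site accepts a given User-Agent string can change over time, so treat the returned status as something to check rather than assume.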
