I am getting an error that leads me to believe my program cannot find a website that I know exists. The website is
https://www.transfermarkt.com/marco-reus/verletzungen/spieler/35207
My code looks like this:
from urllib import request as u_r

def strip_webite():
    with u_r.urlopen("https://www.transfermarkt.com/marco-reus/verletzungen/spieler/35207") as f:
        pass

if __name__ == "__main__":
    strip_webite()
The error I get is:
  File "database_management.py", line 19, in <module>
    strip_webite()
  File "database_management.py", line 15, in strip_webite
    with u_r.urlopen("https://www.transfermarkt.com/marco-reus/verletzungen/spieler/35207") as f:
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 532, in open
    response = meth(req, response)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 570, in error
    return self._call_chain(*args)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
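It looks like Transfermarkt is blocking requests that carry the default User-Agent string sent by Python's urllib library, even though its robots.txt file does not mention anything in this regard. That seems to imply that they do not mind being scraped, but would prefer that we announce who we are.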
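With urllib, this can be done by building a Request object that carries a custom User-Agent header and passing it to urlopen.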
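The code that originally followed this sentence did not survive on this page, so here is a minimal sketch of the approach, not the answerer's exact code. The `USER_AGENT` value and the `build_request` helper are hypothetical names introduced for illustration. Whether the site accepts any particular User-Agent string is an assumption that has to be checked against the live server.

```python
from urllib import request as u_r

URL = "https://www.transfermarkt.com/marco-reus/verletzungen/spieler/35207"

# Hypothetical identifier (an assumption): replace it with a string
# that describes your own client.
USER_AGENT = "my-injury-scraper/0.1"


def build_request(url, user_agent=USER_AGENT):
    """Return a Request that carries an explicit User-Agent header."""
    return u_r.Request(url, headers={"User-Agent": user_agent})


def strip_webite(url=URL):
    """Fetch the page with the custom header and return its raw bytes."""
    with u_r.urlopen(build_request(url)) as f:
        return f.read()
```

Calling `strip_webite()` from the `if __name__ == "__main__":` guard in the question then sends the request with the custom header in place of urllib's default one. If the server still answers with an error status, `urlopen` raises `urllib.error.HTTPError` just as before, so the call may still need a `try`/`except` around it.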