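<p>This is not an HTTPS problem; the page you are trying to scrape simply restricts access to some of its files. When you expect exceptions, it is best to handle them. In this case, any of the file links may be broken or inaccessible.</p>
<p>Try handling the exception like this:</p>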
<pre><code>import os
import urllib.request  # Python 3; the Python 2 urllib.URLopener no longer exists

import requests
from bs4 import BeautifulSoup

page = requests.get("https://www.nationalgrid.com/uk/electricity/market-and-operational-data/data-explorer")
soup = BeautifulSoup(page.content, 'html.parser')
mainLocation = "https://www.nationalgrid.com"
targetDir = "forecasted-demand-files"
os.makedirs(targetDir, exist_ok=True)  # the target directory must exist before downloading

for document in soup.find_all('a', class_='download'):
    document_name = document["title"]
    document_url = mainLocation + document["href"]
    try:
        urllib.request.urlretrieve(document_url, os.path.join(targetDir, document_name))
    except IOError as e:
        # Broken or restricted links raise IOError; report the link and continue with the next file.
        print('failed to download: {} ({})'.format(document_url, e))
</code></pre>
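<p>If you also want to know afterwards which links failed, the same try/except loop can be wrapped in a small function that collects them. This is only a minimal sketch: <code>download_all</code> and its <code>retrieve</code> parameter are hypothetical names introduced here for illustration, and the function assumes you have already extracted <code>(name, url)</code> pairs from the page.</p>

```python
import os
import urllib.request


def download_all(documents, target_dir, retrieve=urllib.request.urlretrieve):
    """Download each (name, url) pair into target_dir and return the URLs that failed.

    `retrieve` is a hypothetical injection point so the loop can be tried
    without network access; by default it is urllib.request.urlretrieve.
    """
    os.makedirs(target_dir, exist_ok=True)
    failed = []
    for name, url in documents:
        try:
            retrieve(url, os.path.join(target_dir, name))
        except OSError as exc:  # IOError is an alias of OSError in Python 3
            print('failed to download: {} ({})'.format(url, exc))
            failed.append(url)
    return failed
```

<p>Calling <code>download_all(pairs, "forecasted-demand-files")</code> would then skip inaccessible files instead of aborting, and the returned list tells you which links to inspect by hand.</p>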