How do I get around HTTP Error 403: Forbidden with urllib.request using Python 3

Posted 2024-06-16 10:50:53

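Hi, not every time, but sometimes when I try to access the LSE page I get the ever-annoying HTTP Error 403: Forbidden message.

Does anyone know how I can overcome this using only standard Python modules (so, sadly, no Beautiful Soup)?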

import urllib.request

url = "http://www.londonstockexchange.com/exchange/prices-and-markets/stocks/indices/ftse-indices.html"
infile = urllib.request.urlopen(url) # Open the URL
data = infile.read().decode('ISO-8859-1') # Read the content as string decoded with ISO-8859-1

print(data) # Print the data to the screen

Every now and then, however, I get this error:

HTTP Error 403: Forbidden

Link to the list of all the standard modules: https://docs.python.org/3.4/py-modindex.html

Thanks in advance.


2 answers

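This is probably caused by mod_security. You need to open the URL the way a browser would, not as Python urllib, so that the server does not recognize the request as coming from a script.

Here is your code, corrected: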

import urllib.request

url = "http://www.londonstockexchange.com/exchange/prices-and-markets/stocks/indices/ftse-indices.html"

# Open the URL as a browser would, not as Python urllib
page = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
infile = urllib.request.urlopen(page).read()
data = infile.decode('ISO-8859-1')  # Decode the response bytes as ISO-8859-1

print(data) # Print the data to the screen
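
To see why the header matters, it can help to look at what urllib sends when no header is set. The following is a small sketch, not part of the original answer; the helper names `default_user_agent` and `browser_request` are made up for illustration. It assumes that the server rejects requests based on the User-Agent string, which is a common mod_security-style rule but is not confirmed for this site.

```python
import urllib.request


def default_user_agent():
    """Return the User-Agent that a plain urllib opener sends by default."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) pairs applied to every request
    return dict(opener.addheaders)["User-agent"]


def browser_request(url, user_agent="Mozilla/5.0"):
    """Build a Request object that carries a browser-like User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# The default value starts with "Python-urllib/", which is easy for a server to filter
print(default_user_agent())

# Request stores header names capitalized, hence the "User-agent" spelling here
request = browser_request("http://www.londonstockexchange.com/")
print(request.get_header("User-agent"))
```

Nothing here touches the network; the `Request` object is only sent once it is passed to `urllib.request.urlopen`.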

Next, you can use BeautifulSoup to scrape the HTML.

It looks like you are being rate limited. Try sleeping for a while and then retrying. For example:
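
Since the question rules out third-party packages, the standard library's `html.parser` is one possible substitute for BeautifulSoup. Below is a minimal sketch that collects the text of table cells. The class `CellTextParser`, the function `parse_rows`, and the `SAMPLE_HTML` markup are all hypothetical; the real LSE page structure is not shown in this thread, so the parser would need adapting to it.

```python
from html.parser import HTMLParser


class CellTextParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # finished rows, each a list of cell strings
        self._row = None        # cells of the row being parsed, or None
        self._in_cell = False   # True while inside a <td> element
        self._cell = []         # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._cell = []

    def handle_data(self, data):
        if self._in_cell:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            self._row.append("".join(self._cell).strip())
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows without <td> cells, such as header rows
                self.rows.append(self._row)
            self._row = None


# Hypothetical markup, only for demonstration
SAMPLE_HTML = (
    "<table><tr><th>Name</th><th>Value</th></tr>"
    "<tr><td>Index A</td><td> 100.5 </td></tr>"
    "<tr><td>Index B</td><td>200</td></tr></table>"
)


def parse_rows(html):
    """Parse an HTML string and return its table rows as lists of strings."""
    parser = CellTextParser()
    parser.feed(html)
    parser.close()
    return parser.rows


print(parse_rows(SAMPLE_HTML))
```

In practice the `data` string obtained from the request would be passed to `parse_rows` in place of `SAMPLE_HTML`.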

import urllib.error
import urllib.request
from time import sleep

LSE_URL = "http://www.londonstockexchange.com/exchange/prices-and-markets/stocks/indices/ftse-indices.html"
WAIT_PERIOD = 15

def stock_data_reader():
    stock_data = get_stock_data()
    while not stock_data:
        sleep(WAIT_PERIOD)  # wait before the next retry
        stock_data = get_stock_data()

    print(stock_data)  # do something with the stock data



def get_stock_data():
    try:
        infile = urllib.request.urlopen(LSE_URL) # Open the URL
    except urllib.error.HTTPError as http_err:
        print("Error: %s" % http_err)
        return None
    else:
        data = infile.read().decode('ISO-8859-1') # Read the content as string decoded with ISO-8859-1
        return data


stock_data_reader()
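
The two answers can also be combined: send a browser-like User-Agent and retry after a pause when the server still responds with an error. The sketch below is one way to do that, with a bounded number of attempts so that it cannot loop forever the way the retry loop above can. The function name `fetch_with_retry` and its parameters (`attempts`, `wait`, `opener`, `sleeper`) are assumptions introduced here, not part of either answer; `opener` and `sleeper` exist only so the function can be exercised without network access.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, attempts=3, wait=15,
                     opener=urllib.request.urlopen, sleeper=time.sleep):
    """Fetch url with a browser-like User-Agent, retrying on HTTP errors.

    Returns the decoded page text, or None if every attempt failed.
    """
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    for attempt in range(1, attempts + 1):
        try:
            with opener(request) as response:
                return response.read().decode("ISO-8859-1")
        except urllib.error.HTTPError as http_err:
            print("Attempt %d failed: %s" % (attempt, http_err))
            if attempt < attempts:
                sleeper(wait)  # pause before the next attempt
    return None
```

A call such as `fetch_with_retry(LSE_URL, attempts=5, wait=WAIT_PERIOD)` would reuse the constants from the answer above. If the server rejects the request for a reason other than the User-Agent or rate limiting, the function simply returns None after the last attempt.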
