requests library crashes on Python 2 and Python 3

try:
    response = requests.get(url)
except Exception as error:
    return False
if response.encoding == None:
    soup = bs4.BeautifulSoup(response.text)  # This is line 809
else:
    soup = bs4.BeautifulSoup(response.text, from_encoding=response.encoding)

@property
def text(self):
    """Content of the response, in unicode.

    if Response.encoding is None and chardet module is available, encoding
    will be guessed.
    """
    # Try charset from content-type
    content = None
    encoding = self.encoding

    # Fallback to auto-detected encoding.
    if self.encoding is None:
        if chardet is not None:
            encoding = chardet.detect(self.content)['encoding']

    # Decode unicode from given encoding.
    try:
        content = str(self.content, encoding, errors='replace')  # This is line 809
    except LookupError:
        # A LookupError is raised if the encoding was not found which could
        # indicate a misspelling or similar mistake.
        #
        # So we try blindly encoding.
        content = str(self.content, errors='replace')

    return content

1 answer

Anonymous user

Answer 1 · posted 2024-04-19 20:06:03

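This means the server did not send an encoding for the content in its headers, and the chardet library could not determine the content's encoding either. In fact, you are deliberately testing for a missing encoding; if no encoding is available, why try to get the decoded text?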
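To make the failure concrete, here is a minimal sketch of the decode step from the quoted `text` property. It assumes Python 3, and `decode_like_requests` is a hypothetical helper written for illustration, not part of requests. It shows that an unknown codec name is caught by the `except LookupError` branch, while an encoding of `None` raises a `TypeError` that this branch does not catch.

```python
def decode_like_requests(raw, encoding):
    """Simplified stand-in for the decode step quoted above (hypothetical helper)."""
    try:
        return str(raw, encoding, errors="replace")
    except LookupError:
        # Unknown codec name: fall back to the default decoding.
        return str(raw, errors="replace")


raw = b"caf\xc3\xa9"

# A known codec decodes normally.
print(decode_like_requests(raw, "utf-8"))

# An unknown codec raises LookupError inside the helper, which is caught.
print(decode_like_requests(raw, "no-such-codec"))

# None is not a codec name at all: str() raises TypeError, which
# the `except LookupError` clause does not handle.
try:
    decode_like_requests(raw, None)
except TypeError as error:
    print("TypeError:", error)
```

The exact wording of the `TypeError` message differs between Python versions, so the sketch prints it rather than assuming a particular text.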

You could try leaving the decoding to the BeautifulSoup parser:

if response.encoding is None:
    soup = bs4.BeautifulSoup(response.content)

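There is also no need to pass the encoding to BeautifulSoup: if .text does not fail, you are already working with Unicode, and BeautifulSoup will ignore the encoding argument anyway: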

else:
    soup = bs4.BeautifulSoup(response.text)
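Putting the two branches together, a rough sketch might look like the following. The helper names `markup_for_soup` and `make_soup` are hypothetical, the `SimpleNamespace` objects only stand in for real `requests` responses so the selection logic can be run without a network call, and passing `"html.parser"` explicitly is an addition that the original snippets do not include.

```python
from types import SimpleNamespace


def markup_for_soup(response):
    """Pick what to hand to BeautifulSoup (hypothetical helper).

    With no known encoding, return the raw bytes so the parser can
    detect the encoding itself; otherwise return the decoded text.
    """
    if response.encoding is None:
        return response.content
    return response.text


def make_soup(response):
    """Build the soup; requires the third-party bs4 package."""
    import bs4  # imported lazily so the selection logic runs without bs4

    return bs4.BeautifulSoup(markup_for_soup(response), "html.parser")


# Stand-ins for requests responses, for illustration only.
no_encoding = SimpleNamespace(encoding=None, content=b"<p>hi</p>", text=None)
with_encoding = SimpleNamespace(encoding="utf-8", content=b"<p>hi</p>", text="<p>hi</p>")

print(type(markup_for_soup(no_encoding)).__name__)    # raw bytes are passed through
print(type(markup_for_soup(with_encoding)).__name__)  # decoded text is passed through
```

With a real response, `make_soup(requests.get(url))` would replace the whole `if`/`else` from the question, and `.text` is only touched when an encoding is known.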
