How do I decode Chrome's HTTP request headers?

Posted 2024-05-15 12:28:59


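I am using Python and sockets to build a simple server. When I receive the headers with request.recv(1024), I cannot decode the received bytes. The same code works fine with Firefox. I am using utf-8 as the codec for decoding.

Does Chrome use a different encoding or something? The error is: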

Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    head.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 8: invalid


Tags: data, server, encoding, request, error, codec, chrome
1 answer
User
#1 · Posted 2024-05-15 12:28:59

I don't know exactly what you are doing (no code is shown), but an HTTP message is not UTF-8 in the first place. Quoting from the standard (RFC 7230):

A recipient MUST parse an HTTP message as a sequence of octets in an encoding that is a superset of US-ASCII [USASCII]. Parsing an HTTP message as a stream of Unicode characters, without regard for the specific encoding, creates security vulnerabilities due to the varying ways that string processing libraries handle invalid multibyte character sequences that contain the octet LF (%x0A).

Later it says, about the values of fields in the HTTP header:

Historically, HTTP has allowed field content with text in the ISO-8859-1 charset [ISO-8859-1], supporting other charsets only through use of [RFC2047] encoding. In practice, most HTTP header field values use only a subset of the US-ASCII charset [USASCII]. Newly defined header fields SHOULD limit their field values to US-ASCII octets. A recipient SHOULD treat other octets in field content (obs-text) as opaque data.

So, to begin with, do not decode an HTTP message as UTF-8, because it is not meant to be interpreted that way.
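As a minimal sketch of what that could look like in a socket-based server: the helper below keeps the message as bytes while splitting it, and only decodes the pieces with ISO-8859-1 (`latin-1` in Python), which maps every byte to exactly one character and therefore cannot raise. The name `parse_head` and the sample request are made up for illustration; they are not from the question, and a real server would need more validation than this.

```python
def parse_head(raw: bytes):
    """Split a raw HTTP request head into its request line and header fields.

    Hypothetical helper: splitting is done on bytes, and decoding uses
    ISO-8859-1, so no byte value can trigger a UnicodeDecodeError.
    """
    # Everything before the first empty line is the head; the rest is body.
    head, _, _body = raw.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")

    request_line = lines[0].decode("latin-1")

    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(b":")
        if not sep:
            continue  # this sketch simply skips lines without a colon
        # Field names are case-insensitive; strip optional spaces and tabs.
        key = name.decode("latin-1").strip(" \t").lower()
        headers[key] = value.decode("latin-1").strip(" \t")
    return request_line, headers


# Made-up request with a raw 0xfc byte at position 8, as in the traceback.
sample = b"GET /caf\xfc HTTP/1.1\r\nHost: localhost\r\nX-Name: M\xfcller\r\n\r\n"
request_line, headers = parse_head(sample)
print(request_line)
print(headers)
```

In the question's code, the bytes returned by request.recv(1024) would take the place of `sample`. Note that a single recv call is not guaranteed to return the complete head, so a real server would keep reading until it has seen the empty line.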

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 8: invalid

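The byte 0xfc is ü when interpreted as ISO-8859-1, which is probably the interpretation the sender intended. It is not valid UTF-8, but as I said, an HTTP message should not be treated as UTF-8 in the first place.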
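The difference between the two codecs for this one byte can be checked directly in a Python shell. This is only an illustration of the point above; the variable names are arbitrary.

```python
raw = b"\xfc"

# ISO-8859-1 maps each byte to the code point with the same number,
# so 0xfc becomes U+00FC, which is the letter ü.
as_latin1 = raw.decode("iso-8859-1")

# In UTF-8 a lone 0xfc byte is not a valid sequence, so decoding raises.
try:
    raw.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# Encoding back with ISO-8859-1 reproduces the original byte exactly.
round_trip = as_latin1.encode("iso-8859-1")

print(as_latin1, utf8_failed, round_trip == raw)
```

Because the ISO-8859-1 round trip is lossless, decoding with it never destroys information: if a particular field later turns out to carry UTF-8, the original bytes can still be recovered and decoded again.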
