Should my HTTP server expect out-of-order data when receiving HTTP traffic from a single client socket?
I am implementing my own HTTP server:
import socket
import threading
import queue
import ssl
from manipulator.parser import LineBuffer, LoggableHttpRequest


class SocketServer:
    """
    Basic Socket Server in python
    """

    def __init__(self, host, port, max_threads, ssl_context: ssl.SSLContext = None):
        print("Create Server For Http")
        self.host = host
        self.port = port
        self.server_socket = self.initSocket()
        self.max_threads = max_threads
        self.request_queue = queue.Queue()
        self.ssl_context = None
        if ssl_context is not None:
            print("Initialise SSL context")
            self.ssl_context = ssl_context

    def initSocket(self):
        return socket.socket(socket.AF_INET, socket.SOCK_STREAM)

    def __accept(self):
        self.server_socket.listen(5)
        while True:
            try:
                client_socket, client_address = self.server_socket.accept()
                if self.ssl_context is not None:
                    print(self.ssl_context)
                    client_socket = self.ssl_context.wrap_socket(client_socket, server_side=True)
                self.request_queue.put((client_socket, client_address))
            except Exception as exc:
                print("Error occurred", exc)

    def __handle(self):
        while True:
            client_socket, address = self.request_queue.get()
            print("Address", address)
            try:
                # Read HTTP Request
                # Log Http Request
                # Manipulate Http Request
                # Forward or respond
                buffer = LineBuffer()
                request = HttpRequest(self.db)
                buffer.pushData(client_socket.recv(2048))
                line = buffer.getLine()
                if line is not None:
                    request.parse(line)
                content = '<html><body>Hello World</body></html>\r\n'.encode()
                headers = f'HTTP/1.1 200 OK\r\nContent-Length: {len(content)}\r\nContent-Type: text/html\r\n\r\n'.encode()
                client_socket.sendall(headers + content)
            finally:
                client_socket.shutdown(socket.SHUT_RDWR)
                client_socket.close()
                self.request_queue.task_done()

    def __initThreads(self):
        for _ in range(self.max_threads):
            threading.Thread(target=self.__handle, daemon=True).start()

    def start(self):
        self.server_socket.bind((self.host, self.port))
        self.__initThreads()
        self.__accept()
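I am doing this because I want to log and analyze incoming HTTP requests as quickly as possible. Also, many third-party libraries require C bindings, which I want to avoid.
So far I have written a line splitter that splits the incoming data at \r\n: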
class LineBuffer:
    def __init__(self):
        self.buffer = b''

    def pushData(self, data):
        # recv() already returns bytes, so append without re-encoding
        self.buffer += data

    def getLine(self):
        # Return the next complete line including its CRLF, or None
        if b'\r\n' in self.buffer:
            line, sep, self.buffer = self.buffer.partition(b'\r\n')
            return line + sep
        return None
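Next, I want to parse each line and turn the lines into an object representing the HTTP request, so that I can process requests further in a streaming fashion: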
class HttpRequest:
    def __init__(self, db):
        self.headers = {}  # Parsed headers
        self.body = ""  # HTTP body
        self.version = None
        self.method = None
        self.id = None
        self.raw = ""


class HttpParser:
    def __init__(self, db):
        self.db = db
        self.currentRequest = None

    def parse(self, line):
        # do parsing here
        return
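A minimal sketch of how such a line-driven parser might look, assuming each line arrives as bytes ending in `\r\n` (which is what `LineBuffer.getLine()` returns). The names `ParsedRequest` and `LineParser` are hypothetical and not part of the code above; body handling and error recovery are left out.

```python
from dataclasses import dataclass, field


@dataclass
class ParsedRequest:
    """Hypothetical container for one parsed request head."""
    method: str = ""
    target: str = ""
    version: str = ""
    headers: dict = field(default_factory=dict)


class LineParser:
    """Hypothetical line-driven parser: request line first, then headers."""

    def __init__(self):
        self.current = None

    def feed_line(self, line: bytes):
        """Consume one CRLF-terminated line. Return a ParsedRequest once the
        blank line ending the header section is seen, otherwise None."""
        text = line.rstrip(b"\r\n").decode("iso-8859-1")
        if self.current is None:
            if not text:
                return None  # tolerate blank lines before the request line
            # Raises ValueError on a malformed request line.
            method, target, version = text.split(" ", 2)
            self.current = ParsedRequest(method, target, version)
            return None
        if text == "":
            done, self.current = self.current, None
            return done
        name, _, value = text.partition(":")
        self.current.headers[name.strip().lower()] = value.strip()
        return None


parser = LineParser()
result = None
for raw in (b"GET / HTTP/1.1\r\n", b"Host: lala1.com\r\n", b"\r\n"):
    result = parser.feed_line(raw)
print(result)
```

Because the parser keeps only the request currently being read, it relies on the lines of one request arriving contiguously.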
My biggest concern is the case where a client sends two requests:
Request 1:
GET / HTTP/1.1\r\n
HOST lala1.com \r\n
Request 2:
POST /file HTTP/1.1\r\n
HOST lala2.com \r\n
\r\n
Qm9QUVM5NDMuLnEvXVN7O2E=
fDMpQjcpOlFodClgOGUzYQ==
NVgvNipmU1d3YFgtLFUhQiM=
MiZwSk0zKno9TkVxNyZFL3s=
NEhGJXZ7OGciOE8mYF5JNA==
dVlJLzpdKlUjXl4tcEpufQ==
XVgiXCdjQyckMjY/Ikt6Rw==
alksJlZ+XHFzQSYqaHlHIztt
YiRnPjdye0gvanV3ZGxaZkI=
MjgwTX0uYHw6M295RS52UDM=
YU0yQ2dQLmJUQVpCNS89PWJB
Ti10MHJBTjAqUFUlIU0sMyRN
but my server receives them in this order:
GET / HTTP/1.1\r\n
POST /file HTTP/1.1\r\n
HOST lala1.com \r\n
\r\n\r\nQm9QUVM5ND
HOST lala2.com \r\n
MuLnEvXVN7O2E=
fDMpQjcpOlFodClgOGUzYQ==
NVgvNipmU1d3YFgtLFUhQiM=
MiZwSk0zKno9TkVxNyZFL3s=
NEhGJXZ7OGciOE8mYF5JNA==
dVlJLzpdKlUjXl4tcEpufQ==
XVgiXCdjQyckMjY/Ikt6Rw==
alksJlZ+XHFzQSYqaHlHIztt
YiRnPjdye0gvanV3ZGxaZkI=
MjgwTX0uYHw6M295RS52UDM=
YU0yQ2dQLmJUQVpCNS89PWJB
Ti10MHJBTjAqUFUlIU0sMyRN
\r\n
Is this scenario possible in my case? Or does the TCP socket take care of the ordering of the data itself?
1 Answer
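In HTTP/1, requests and responses are serialized: within a single TCP connection, multiple requests or multiple responses are never interleaved. Responses must be sent in the same order as the requests they answer, on the same TCP connection.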
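TCP delivers the bytes of a connection in the order they were sent, so what an HTTP/1 server has to cope with is not reordering but arbitrary chunk boundaries: one `recv()` may return part of a request, or more than one. A minimal sketch of splitting such a byte stream into consecutive requests follows. The class name `RequestSplitter` is hypothetical, and it assumes bodies are delimited by `Content-Length` only (chunked transfer encoding is not handled).

```python
class RequestSplitter:
    """Hypothetical helper that cuts an in-order byte stream into requests."""

    def __init__(self):
        self.buffer = b""

    def feed(self, data: bytes):
        """Append received bytes and return the list of complete
        (head, body) pairs, in the order they appeared on the wire."""
        self.buffer += data
        complete = []
        while True:
            head, sep, rest = self.buffer.partition(b"\r\n\r\n")
            if not sep:
                break  # header section not complete yet
            length = 0
            for line in head.split(b"\r\n")[1:]:
                name, _, value = line.partition(b":")
                if name.strip().lower() == b"content-length":
                    length = int(value.strip())
            if len(rest) < length:
                break  # body not complete yet, wait for more data
            complete.append((head, rest[:length]))
            self.buffer = rest[length:]
        return complete


stream = (
    b"GET / HTTP/1.1\r\nHost: lala1.com\r\n\r\n"
    b"POST /file HTTP/1.1\r\nHost: lala2.com\r\nContent-Length: 5\r\n\r\nhello"
)
splitter = RequestSplitter()
requests = []
# Feed the stream in 7-byte pieces to mimic arbitrary recv() boundaries.
for i in range(0, len(stream), 7):
    requests.extend(splitter.feed(stream[i:i + 7]))
for head, body in requests:
    print(head.split(b"\r\n")[0], body)
```

In a real handler, `feed()` would be called in a loop with each `recv()` result until the client closes the connection, rather than after a single `recv(2048)`.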
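HTTP/2 is different. Requests and responses are split into frames, and frames belonging to different requests or responses can be interleaved on the same TCP connection. Multiple requests and responses can therefore be in flight at the same time, and responses need not arrive in request order. Your current code only supports HTTP/1, though: it makes no attempt to parse the completely different HTTP/2 format, so it only ever has to deal with HTTP/1 requests.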
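To illustrate the framing, here is a minimal sketch that decodes the fixed 9-octet HTTP/2 frame header (24-bit payload length, 8-bit type, 8-bit flags, one reserved bit plus a 31-bit stream identifier) and checks for the client connection preface that an HTTP/2 client sends first. The function names `looks_like_http2` and `parse_frame_header` are hypothetical, and the frame headers below are handcrafted for illustration, not captured traffic.

```python
# Client connection preface defined by the HTTP/2 specification.
HTTP2_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"


def looks_like_http2(first_bytes: bytes) -> bool:
    """Return True if the connection starts with the HTTP/2 client preface."""
    return first_bytes.startswith(HTTP2_PREFACE)


def parse_frame_header(header: bytes):
    """Decode a 9-octet frame header into (length, type, flags, stream_id)."""
    if len(header) != 9:
        raise ValueError("an HTTP/2 frame header is exactly 9 octets")
    length = int.from_bytes(header[0:3], "big")
    frame_type = header[3]
    flags = header[4]
    # The top bit of the last four octets is reserved and must be masked off.
    stream_id = int.from_bytes(header[5:9], "big") & 0x7FFFFFFF
    return length, frame_type, flags, stream_id


# Handcrafted headers: HEADERS (type 0x1) on streams 1 and 3, then
# DATA (type 0x0) on stream 1, i.e. frames of two requests interleaved.
frame_headers = [
    bytes([0, 0, 5, 0x1, 0x4, 0, 0, 0, 1]),
    bytes([0, 0, 5, 0x1, 0x4, 0, 0, 0, 3]),
    bytes([0, 0, 8, 0x0, 0x1, 0, 0, 0, 1]),
]
parsed = [parse_frame_header(h) for h in frame_headers]
for length, frame_type, flags, stream_id in parsed:
    print(f"stream={stream_id} type={frame_type} flags={flags} length={length}")
```

The stream identifier is what lets a receiver reassemble each request even though frames of different requests alternate on the wire. An HTTP/1-only server could use a check like `looks_like_http2` to reject such connections early instead of misparsing them.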
For details about the protocols, see the relevant standards.