Python: httplib getresponse issues many recv() calls

Asked on 2025-04-17 13:38

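getresponse issues a large number of recv calls while reading the headers of an HTTP response. In fact, it issues one recv for every single byte, which results in a lot of system calls. How can this be optimized?

I verified this with strace on an Ubuntu machine.

Sample code: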

import httplib  # Python 2 module; it is named http.client in Python 3

conn = httplib.HTTPConnection("www.python.org")
conn.request("HEAD", "/index.html")
r1 = conn.getresponse()

strace output:

sendto(3, "HEAD /index.html HTTP/1.1\r\nHost:"..., 78, 0, NULL, 0) = 78
recvfrom(3, "H", 1, 0, NULL, NULL)      = 1
recvfrom(3, "T", 1, 0, NULL, NULL)      = 1
recvfrom(3, "T", 1, 0, NULL, NULL)      = 1
recvfrom(3, "P", 1, 0, NULL, NULL)      = 1
recvfrom(3, "/", 1, 0, NULL, NULL)      = 1
...

Tags: performance-optimization, ubuntu, network-requests, system-calls, strace, http, headers

1 Answer

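Pass buffering=True so that the response is read through a buffered file object instead of one byte at a time:

r = conn.getresponse(buffering=True)

In Python 3.1 and later, getresponse has no buffering parameter; buffered reading is the default.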
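To illustrate why buffering cuts down the number of system calls, here is a minimal Python 3 sketch. It does not use httplib at all: it sends a fixed header block over a local socket pair and counts how often the socket is read, once byte by byte (as in the strace output from the question) and once through io.BufferedReader. The names HEADERS, CountingRaw, read_headers_unbuffered, read_headers_buffered and demo are made up for this illustration.

```python
import io
import socket

# Hypothetical response headers used only for this demonstration.
HEADERS = b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\nServer: demo\r\n\r\n"


class CountingRaw(io.RawIOBase):
    """Raw stream over a socket that counts how many reads hit the socket."""

    def __init__(self, sock):
        super().__init__()
        self.sock = sock
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, buf):
        # Each call here corresponds to one recv-style call on the socket.
        self.calls += 1
        return self.sock.recv_into(buf)


def read_headers_unbuffered(raw):
    """Read header lines one byte per socket read."""
    lines, line = [], b""
    while True:
        ch = raw.read(1)
        if not ch:
            break
        line += ch
        if line.endswith(b"\n"):
            lines.append(line)
            if line == b"\r\n":  # blank line ends the header block
                break
            line = b""
    return lines


def read_headers_buffered(raw):
    """Read header lines through a buffer that fetches large chunks."""
    reader = io.BufferedReader(raw)
    lines = []
    while True:
        line = reader.readline()
        lines.append(line)
        if line in (b"\r\n", b""):
            break
    return lines


def demo():
    """Return {name: (socket_read_count, header_lines)} for both strategies."""
    results = {}
    for name, reader in (("unbuffered", read_headers_unbuffered),
                         ("buffered", read_headers_buffered)):
        sender, receiver = socket.socketpair()
        sender.sendall(HEADERS)
        sender.close()
        raw = CountingRaw(receiver)
        lines = reader(raw)
        results[name] = (raw.calls, lines)
        receiver.close()
    return results


if __name__ == "__main__":
    for name, (calls, _) in demo().items():
        print(name, "socket reads:", calls)
```

The unbuffered reader touches the socket once per header byte, while the buffered reader pulls the data in large chunks and splits lines in memory. In Python 3, http.client wraps the socket with sock.makefile("rb"), which provides this kind of buffering without any extra argument.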

Answered on 2025-04-17 by Python大师
