Python ThreadPoolExecutor and closed connections

Published 2024-03-28 10:34:11


I'm making some threaded requests that pull JSON data from an API. There will be 5 to 15 concurrent calls happening. The script I wrote works fine most of the times I run it. The futures are pushed onto a list, and then I can iterate over the returned data and do the dataframe work that needs to happen (what comes back is JSON, but I don't think that matters here).

It fails when one of the requests takes much longer than the others. What bubbles up is:

  File "C:\Users\me\AppData\Local\Programs\Python\Python38-32\lib\site-packages\urllib3\connectionpool.py", line 665, in urlopen
    httplib_response = self._make_request(
  File "C:\Users\me\AppData\Local\Programs\Python\Python38-32\lib\site-packages\urllib3\connectionpool.py", line 421, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "C:\Users\me\AppData\Local\Programs\Python\Python38-32\lib\site-packages\urllib3\connectionpool.py", line 416, in _make_request
    httplib_response = conn.getresponse()
  File "C:\Users\me\AppData\Local\Programs\Python\Python38-32\lib\http\client.py", line 1322, in getresponse
    response.begin()
  File "C:\Users\me\AppData\Local\Programs\Python\Python38-32\lib\http\client.py", line 303, in begin
    version, status, reason = self._read_status()
  File "C:\Users\me\AppData\Local\Programs\Python\Python38-32\lib\http\client.py", line 272, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

I think what this means is that because the first requests (which can sometimes finish in under a second) completed long ago, the longest request (which can take 2 minutes or more) has its connection terminated before it finishes. When I run that long request by itself in Postman, it completes just fine. It only dies on me when these short requests run ahead of it.

Is there a way to keep the connection alive? Or to not use connections that are shared between threads?

Here is the code. It looks close to this:

import requests
from concurrent.futures import ThreadPoolExecutor
import datetime

def download(url):
    starttime = datetime.datetime.now()
    with requests.get(url) as response:
        # Elapsed wall-clock time, rounded to whole seconds
        runtime = round((datetime.datetime.now() - starttime).total_seconds())

        status_code = response.status_code
        reason = response.reason
        text = response.text

    print(f"Downloaded {url} (in {datetime.timedelta(seconds=runtime)})")

    return status_code, reason, text

urls = [
    "http://www.google.com",
    "http://stackoverflow.com"
]   

futures = []
with ThreadPoolExecutor(max_workers=10) as executor:
    for url in urls:        
        futures.append({"endpoint": url, "result": executor.submit(download, url)})

for future in futures:
    result = future['result'].result()
    print(result[0], result[1])
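One approach that is sometimes suggested for this kind of RemoteDisconnected failure is to give each worker thread its own requests.Session, mounted with a urllib3 Retry so that a request whose connection the server dropped gets retried. This is only a sketch under assumptions, not a confirmed fix for the exact failure above; the get_session helper name and the retry settings are illustrative (allowed_methods requires urllib3 >= 1.26):

```python
import threading

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Thread-local storage: each worker thread lazily gets its own Session,
# so no connection pool is ever shared between threads.
_local = threading.local()

def get_session():
    """Return this thread's Session, creating and configuring it on first use."""
    if not hasattr(_local, "session"):
        session = requests.Session()
        # Retry idempotent GETs a few times, with a short backoff between
        # attempts, when the remote end closes the connection.
        retry = Retry(total=3, backoff_factor=0.5, allowed_methods=["GET"])
        adapter = HTTPAdapter(max_retries=retry)
        session.mount("http://", adapter)
        session.mount("https://", adapter)
        _local.session = session
    return _local.session

def download(url):
    # Use the per-thread session; a generous timeout accommodates the
    # long-running calls described above.
    with get_session().get(url, timeout=300) as response:
        return response.status_code, response.reason, response.text
```

The download function keeps the same return shape as the original, so it can be submitted to the ThreadPoolExecutor unchanged.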
