Python, Requests, threading: how quickly does Python Requests close its sockets?

Posted 2024-05-15 03:31:29


I am trying to perform actions with Python Requests. Here is my code:

import threading
import resource
import time
import sys

import requests

#maximum Open File Limit for thread limiter.
maxOpenFileLimit = resource.getrlimit(resource.RLIMIT_NOFILE)[0] # For example, it shows 50.

# Will use one session for every Thread.
requestSessions = requests.Session()
# Making requests Pool bigger to prevent [Errno -3] when socket stacked in CLOSE_WAIT status.
adapter = requests.adapters.HTTPAdapter(pool_maxsize=(maxOpenFileLimit+100))
requestSessions.mount('http://', adapter)
requestSessions.mount('https://', adapter)

def threadAction(a1, a2):
    global number
    time.sleep(1) # My actions with Requests for each thread.
    number = number + 1
    print(number)

number = 0 # Count of complete actions

ThreadActions = [] # Action tasks.
for i in range(50): # I have 50 websites I need to do in parallel threads.
    a1 = i
    for n in range(10): # Every website I need to do in 3 threads
        a2 = n
        ThreadActions.append(threading.Thread(target=threadAction, args=(a1,a2)))


for item in ThreadActions:
    # But I can't do more than 50 Threads at once, because of maxOpenFileLimit.
    while True:
        # Thread limiter, analogue of BoundedSemaphore.
        if threading.active_count() < maxOpenFileLimit:
            item.start()
            break
        else:
            continue

for item in ThreadActions:
    item.join()

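The trouble is that once I have 50 threads running, the thread limiter starts waiting for some thread to finish its work, and this is where the problem shows up. After the script enters the limiter, lsof -i|grep python|wc -l shows far fewer than 50 active connections. Before the limiter kicks in, it shows all <= 50 processes. Why is this happening? Or should I use requests.close() instead of requests.Session() to stop it from reusing sockets that are already open?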


1 Answer
User
#1 · Posted 2024-05-15 03:31:29

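Your limiter is a tight loop: it spins without ever sleeping, so it eats most of your processing time. Use a thread pool to limit the number of worker threads instead.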

import multiprocessing.pool
import resource
import threading
import time

import requests

# Maximum open file limit (soft limit) for this process.
maxOpenFileLimit = resource.getrlimit(resource.RLIMIT_NOFILE)[0]

# Will use one session for every Thread.
requestSessions = requests.Session()
# Making requests Pool bigger to prevent [Errno -3] when socket stacked in CLOSE_WAIT status.
adapter = requests.adapters.HTTPAdapter(pool_maxsize=(maxOpenFileLimit+100))
requestSessions.mount('http://', adapter)
requestSessions.mount('https://', adapter)

number = 0 # Count of complete actions
numberLock = threading.Lock() # Guards updates to number

def threadAction(a1, a2):
    global number
    time.sleep(1) # My actions with Requests for each thread.
    # An unguarded "number = number + 1" is not thread safe, so update under a lock.
    with numberLock:
        number = number + 1
        print(number)

# ThreadPool takes the worker count; chunksize belongs to map()/starmap().
pool = multiprocessing.pool.ThreadPool(50)

ThreadActions = [] # Action tasks.
for i in range(50): # I have 50 websites I need to do in parallel threads.
    a1 = i
    for n in range(10): # Every website I need to do in 3 threads
        a2 = n
        ThreadActions.append((a1,a2))

# starmap unpacks each (a1, a2) tuple into threadAction's two arguments.
pool.starmap(threadAction, ThreadActions, chunksize=1)
pool.close()
pool.join()
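
For comparison, the same "cap the number of workers" idea can also be written with the standard library's concurrent.futures module. This is only a minimal sketch, not the poster's code: thread_action, run_all, MAX_WORKERS and the 0.01 second sleep are hypothetical stand-ins for the real Requests work, and the cap of 50 is assumed from the question.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 50  # assumed cap, standing in for the open-file limit

completed = 0  # count of finished actions
completed_lock = threading.Lock()  # guards updates to the shared counter


def thread_action(a1, a2):
    """Hypothetical stand-in for the per-site work done with Requests."""
    global completed
    time.sleep(0.01)  # placeholder for the real request
    with completed_lock:  # only one thread updates the counter at a time
        completed += 1
    return (a1, a2)


def run_all(sites=50, tasks_per_site=10):
    """Run every (site, task) pair with at most MAX_WORKERS worker threads."""
    tasks = [(a1, a2) for a1 in range(sites) for a2 in range(tasks_per_site)]
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as executor:
        # Idle workers block on the executor's internal queue, so no CPU is
        # spent polling for a free slot the way the busy-wait loop does.
        futures = [executor.submit(thread_action, a1, a2) for a1, a2 in tasks]
        # result() re-raises any exception that was raised inside a worker.
        return [future.result() for future in futures]


results = run_all()
```

The with block waits for all submitted tasks before it exits, so no separate join step is needed, and results come back in submission order because the futures are read in the order they were created.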
