How to maximize multithreading performance in Python

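    import difflib
    import threading

    def longest_substring(s, t, score, j):
        # Store the size of the longest matching block between s and t in score[j].
        match = difflib.SequenceMatcher(None, s, t).get_matching_blocks()
        char_num = []
        for i in match:
            char_num.append(i.size)
        score[j] = max(char_num)

    # df, db, m and n are not defined in the snippet shown in the question.
    for i in range(m):
        score = [None] * n
        s = df.loc[i, 'ocr']
        # One thread per row of db, each comparing that row against the same string s.
        threads = [threading.Thread(target=longest_substring, args=(s, db.loc[j, 'ocr'], score, j)) for j in range(n)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()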

1 Answer

User

#1 · Posted on 2024-04-20 07:58:02

Parallel processing can be a bit tricky, so here are a couple of suggestions:

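First: Python's GIL (Global Interpreter Lock). The CPU usage you are seeing is probably limited to a small number of cores. That is because, in CPython, threads do not run Python code at the same time by default, owing to the GIL, as described below: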

A global interpreter lock (GIL) is a mechanism used in computer-language interpreters to synchronize the execution of threads so that only one native thread can execute at a time. An interpreter that uses GIL always allows exactly one thread to execute at a time, even if run on a multi-core processor.
Applications running on implementations with a GIL can be designed to use separate processes to achieve full parallelism, as each process has its own interpreter and in turn has its own GIL. Otherwise, the GIL can be a significant barrier to parallelism.

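To get the most out of the CPU, use Python's multiprocessing module instead. It distributes your tasks across the available cores, so all of them can be put to work.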
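As a rough illustration of the multiprocessing suggestion, here is a minimal sketch that replaces the one-thread-per-comparison loop from the question with a process pool. The helper names `longest_match_size` and `score_against_all`, and the sample strings standing in for the `ocr` columns, are assumptions made for this example and are not part of the original code.

```python
import difflib
from multiprocessing import Pool


def longest_match_size(pair):
    """Return the size of the longest matching block between two strings."""
    s, t = pair
    blocks = difflib.SequenceMatcher(None, s, t).get_matching_blocks()
    # get_matching_blocks() always ends with a zero-size sentinel block,
    # so max() never receives an empty sequence.
    return max(block.size for block in blocks)


def score_against_all(pool, s, candidates):
    """Score s against every candidate string using an existing process pool."""
    pairs = [(s, t) for t in candidates]
    # pool.map splits the pairs among the worker processes and
    # returns the results in the same order as the input.
    return pool.map(longest_match_size, pairs)


if __name__ == "__main__":
    # Hypothetical stand-ins for the 'ocr' columns of df and db.
    queries = ["hello world", "parallel python"]
    candidates = ["yellow", "world wide web", "python threads"]

    # Create the pool once and reuse it for every query string,
    # instead of starting new workers on each iteration.
    with Pool() as pool:
        for s in queries:
            print(s, score_against_all(pool, s, candidates))
```

Each worker is a separate process with its own interpreter and its own GIL, so the comparisons can run on several cores at once. The `if __name__ == "__main__":` guard matters here because some platforms start workers by re-importing the main module.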

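Second: your problem size. There is a trade-off between data size and CPU usage. If threads are spawned automatically, CPU usage can stay low, which keeps execution time long. Run the approach above so that it uses all CPU cores, then vary the data size to find the optimal value for your case and the point at which you should start scaling.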
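One way to experiment with the data size is to time the same workload serially and with a process pool at a few different sizes. The sketch below does this on randomly generated strings. The helper names, the string length of 100, the sizes 10/100/1000 and the worker count of 4 are all arbitrary assumptions for illustration, and the timings will differ from machine to machine.

```python
import difflib
import random
import string
import time
from multiprocessing import Pool


def longest_match_size(pair):
    """Return the size of the longest matching block between two strings."""
    s, t = pair
    blocks = difflib.SequenceMatcher(None, s, t).get_matching_blocks()
    return max(block.size for block in blocks)


def make_pairs(n_items, length=100, seed=0):
    """Build n_items (query, candidate) pairs of random lowercase strings."""
    rng = random.Random(seed)

    def rand_str():
        return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))

    query = rand_str()
    return [(query, rand_str()) for _ in range(n_items)]


def time_serial(pairs):
    """Score all pairs in the current process; return (seconds, results)."""
    start = time.perf_counter()
    results = [longest_match_size(p) for p in pairs]
    return time.perf_counter() - start, results


def time_parallel(pairs, workers):
    """Score all pairs with a process pool; return (seconds, results).

    The measured time includes starting and stopping the pool, which is
    part of the overhead that small inputs have to pay.
    """
    start = time.perf_counter()
    with Pool(processes=workers) as pool:
        results = pool.map(longest_match_size, pairs)
    return time.perf_counter() - start, results


if __name__ == "__main__":
    for n_items in (10, 100, 1000):
        pairs = make_pairs(n_items)
        t_serial, _ = time_serial(pairs)
        t_parallel, _ = time_parallel(pairs, workers=4)
        print(f"n={n_items}: serial {t_serial:.4f}s, parallel {t_parallel:.4f}s")
```

Comparing the two columns across sizes gives a rough idea of where the parallel version starts to pay off for a given workload.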
