Memory issues with multiprocessing in Python

Posted 2024-05-08 16:06:13


I am trying to use additional cores in my Python program. Here is the basic structure/logic of my code:

import multiprocessing as mp
import pandas as pd
import gc

def multiprocess_RUN(param):
    result = Analysis_Obj.run(param)
    return result

class Analysis_Obj():

    def __init__(self, filename):
        self.DF = pd.read_csv(filename)
        return

    def run_Analysis(self, param):
        # Multi-core option
        pool = mp.Pool(processes=1)
        run_result = pool.map(multiprocess_RUN, [self, param])

        # Normal option
        run_result = self.run(param)

        return run_result

    def run(self, param):

        # Let's say I have written a function to count the frequency of 'param' in the target file
        result = count(self.DF, param)
        return result

if __name__ == "__main__":
    files = ['file1.csv', 'file2.csv']
    params = [1,2,3,4]
    results = []

    for i in range(0,len(files)):
        analysis = Analysis_Obj(files[i])
        for j in range(0,len(params)):
            result = analysis.run_Analysis(params[j])
            results.append(result)
        del result
    del analysis
    gc.collect()

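If I comment out the "Multi-core option" and run the "Normal option", everything works fine. But even when I run the "Multi-core option" with processes=1, I get a Memory Error as soon as the for loop starts on the second file. I deliberately set it up so that an analysis object is created and deleted in each iteration of the for loop, so that the file already processed is cleared from memory. Clearly, this is not working. I would appreciate any advice you can give me.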

Cheers

Edit:

Here is the error message I get in the terminal:

[error output not preserved in the source]
