I am trying to use additional cores in my Python program. Here is the basic structure/logic of my code:
import multiprocessing as mp
import pandas as pd
import gc

def multiprocess_RUN(param):
    result = Analysis_Obj.run(param)
    return result

class Analysis_Obj():
    def __init__(self, filename):
        self.DF = pd.read_csv(filename)
        return

    def run_Analysis(self, param):
        # Multi-core option
        pool = mp.Pool(processes=1)
        run_result = pool.map(multiprocess_RUN, [self, param])
        # Normal option
        run_result = self.run(param)
        return run_result

    def run(self, param):
        # Let's say I have written a function to count the frequency of 'param' in the target file
        result = count(self.DF, param)
        return result

if __name__ == "__main__":
    files = ['file1.csv', 'file2.csv']
    params = [1,2,3,4]
    results = []
    for i in range(0,len(files)):
        analysis = Analysis_Obj(files[i])
        for j in range(0,len(params)):
            result = analysis.run_Analysis(params[j])
            results.append(result)
            del result
        del analysis
        gc.collect()
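If I comment out the "Multi-core option" and run the "Normal option", everything works fine. But when I run the "Multi-core option", even with processes=1, I get a Memory Error as soon as the for loop starts on the second file. I deliberately set it up so that an analysis object is created and deleted on each iteration of the for loop, in order to clear the processed file from memory. Evidently, this does not work. I would appreciate any advice.
Cheers
Edit:
Here is the error message I get in the terminal: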
^{pr2}$
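One possible factor, offered as a guess rather than a confirmed diagnosis: run_Analysis creates a new mp.Pool on every call and never closes it, so worker processes (each of which receives a pickled copy of the object passed to pool.map) may stay alive across files. A minimal sketch of a structure that avoids this is below. It assumes one pool per file that is closed by a with-block, and it bundles the data and the parameter into a single tuple because Pool.map hands each worker call exactly one item. The names multiprocess_run and run_all, and the plain list standing in for self.DF, are made up for illustration; count here is only a stand-in for the counting function mentioned in the question.

```python
import multiprocessing as mp


def count(values, param):
    # Hypothetical stand-in for the question's count(): frequency of param.
    return sum(1 for v in values if v == param)


def multiprocess_run(args):
    # Pool.map passes exactly one item per call, so the data and the
    # parameter travel together as a single tuple.
    values, param = args
    return count(values, param)


def run_all(values, params, processes=1):
    # One pool per file instead of one per parameter. Leaving the
    # with-block terminates the workers, so they do not outlive the file.
    with mp.Pool(processes=processes) as pool:
        return pool.map(multiprocess_run, [(values, p) for p in params])


if __name__ == "__main__":
    # Made-up data in place of the CSV contents (assumption for the sketch).
    data = [1, 2, 2, 3, 3, 3]
    print(run_all(data, [1, 2, 3, 4]))
```

With this layout the outer loop over files would call run_all once per file, so at most one pool exists at a time. Whether this removes the Memory Error depends on the file sizes and the platform's process start method, which the question does not state.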