Unexpected results when timing multiprocessing in Python

Posted 2024-04-20 09:06:56

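I am trying to build a parallel genetic algorithm and a linear (sequential) genetic algorithm to speed up computation of the Rosenbrock function. I started studying Python's threading and multiprocessing libraries and found a way to do it, but (there is always a "but") I am seeing completely unexpected behavior in the evaluation step.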

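I measured the computation of the 2D Rosenbrock function (or any higher dimension) for population sizes in the range [5-500000], with 10 test runs per population size. Here is what happened: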

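A single process is much faster than the iterative algorithm, taking as much as 50% less computation time, which looks completely wrong.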

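Do you have any idea why I get such a large gain between the two? One process should take about as long as the iterative algorithm (or even longer, because starting the process costs resources, right?).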

You can see the full results at link ('n' denotes the dimension of the Rosenbrock function).

#!/usr/bin/python
import scipy
import multiprocessing
from timeit import default_timer as timer
import math

def rosenbrock(x_1, x_2):
    return 100*(x_2-x_1**2)**2 + (1-x_1)**2

def n_rosenbrock(X):
    sum_r = 0
    for i in range(len(X)-1):
        sum_r += rosenbrock(X[i], X[i+1])
    return sum_r

def evaluation(shared_population, shared_fitnesses, nr_of_genes, x_1, x_2):
    for i in range(x_1, x_2, nr_of_genes):
        result = n_rosenbrock(shared_fitnesses[i:i+nr_of_genes])
        shared_fitnesses[int(i/nr_of_genes)] = result

if __name__ == '__main__':
    min_x = -5
    max_x = 5
    cores = 1

    POP_SIZES = [5, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000, 20000, 25000, 50000, 100000, 150000, 200000, 250000, 300000, 350000, 400000, 450000, 500000]
    iters_time = []
    proc_eval_time = []


    for idp, pop_size in enumerate(POP_SIZES):
        for nr_of_genes in range(2, 3):
            population = scipy.random.uniform(min_x, max_x, (pop_size * nr_of_genes))
            shared_population = multiprocessing.Array('f', scipy.reshape(population, pop_size*nr_of_genes), lock=False)
            shared_fitnesses = multiprocessing.Array('f', pop_size, lock=False)

            indexes = [int(pop_size/cores)] * cores
            for x in range(int(pop_size%cores)):
                indexes[x] += 1
            test_c = 10
            process_eval_time = 0
            process_sel_time = 0

            iter_time = 0

            print("Iter", idp)
            iter_population = scipy.reshape(population, pop_size*nr_of_genes)
            iter_fitnesses = scipy.zeros(pop_size)
            for _ in range(test_c):
                iter_timer_start = timer()
                for i in range(0,len(iter_population),nr_of_genes):
                    result = n_rosenbrock(iter_population[i:i+nr_of_genes])
                    iter_fitnesses[int(i/nr_of_genes)] = result
                iter_timer_stop = timer()
                iter_time += (iter_timer_stop-iter_timer_start)
            iters_time.append(iter_time/test_c)

            print("Process", idp)
            for _ in range(test_c):
                processes = scipy.empty(cores, dtype=multiprocessing.Process)
                for idx in range(cores):
                    x_1 = sum(indexes[:idx]) * nr_of_genes
                    x_2 = x_1 + indexes[idx] * nr_of_genes
                    args = (shared_population, shared_fitnesses, nr_of_genes, x_1, x_2)
                    process = multiprocessing.Process(target=evaluation, args=args)
                    processes[idx] = process
                process_eval_start = timer()
                for p in processes:
                    p.start()
                for p in processes:
                    p.join()
                process_eval_stop = timer()
                process_eval_time += (process_eval_stop-process_eval_start)
            proc_eval_time.append(process_eval_time/test_c)

    print("iters_time", iters_time)
    print("process_eval_time", proc_eval_time)

1 answer

Answer 1 · posted 2024-04-20 09:06:56

It looks like your comparison may not be valid. I would suggest organizing the code like this:

def do_iter(x, y, z):
    ...

def do_multiproc(x, y, z):
    ...

for x in population_sizes:
    timeit.timeit('do_iter(x, y, z)')
    timeit.timeit('do_multiproc(x, y, z)')

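This code obviously will not run as written. The point is that all of the setup and processing each method involves should be fully encapsulated in that method's do_x function. The do_x functions should take the same arguments and otherwise be as similar as possible.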
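
A minimal runnable sketch of that structure, under several assumptions: the bodies of do_iter and do_multiproc are stand-ins modelled on the code in the question, the shared (pop_size, nr_of_genes, cores) parameter list and the make_population helper are hypothetical, and the shared arrays use typecode 'd' instead of the question's 'f' so both paths work on the same double-precision values. timeit.timeit is given a callable rather than a string, so no setup string is needed.

```python
import multiprocessing
import random
import timeit


def rosenbrock(x_1, x_2):
    return 100 * (x_2 - x_1 ** 2) ** 2 + (1 - x_1) ** 2


def n_rosenbrock(X):
    # Sum the 2D Rosenbrock function over consecutive pairs of genes.
    return sum(rosenbrock(X[i], X[i + 1]) for i in range(len(X) - 1))


def make_population(pop_size, nr_of_genes, seed=0):
    # Hypothetical helper: a flat list of genes, seeded so both methods
    # are timed on identical input.
    rng = random.Random(seed)
    return [rng.uniform(-5, 5) for _ in range(pop_size * nr_of_genes)]


def do_iter(pop_size, nr_of_genes, cores):
    # `cores` is unused here; it is kept so both functions share a signature.
    population = make_population(pop_size, nr_of_genes)
    fitnesses = [0.0] * pop_size
    for i in range(0, len(population), nr_of_genes):
        fitnesses[i // nr_of_genes] = n_rosenbrock(population[i:i + nr_of_genes])
    return fitnesses


def _evaluate(shared_population, shared_fitnesses, nr_of_genes, start, stop):
    # Worker: evaluate the individuals whose genes lie in [start, stop).
    for i in range(start, stop, nr_of_genes):
        genes = shared_population[i:i + nr_of_genes]
        shared_fitnesses[i // nr_of_genes] = n_rosenbrock(genes)


def do_multiproc(pop_size, nr_of_genes, cores):
    # All setup (shared arrays, process creation) happens inside the
    # function, so it is included in the measured time.
    population = make_population(pop_size, nr_of_genes)
    shared_population = multiprocessing.Array('d', population, lock=False)
    shared_fitnesses = multiprocessing.Array('d', pop_size, lock=False)
    # Split the individuals as evenly as possible across the processes.
    counts = [pop_size // cores + (1 if k < pop_size % cores else 0)
              for k in range(cores)]
    processes = []
    start = 0
    for count in counts:
        stop = start + count * nr_of_genes
        args = (shared_population, shared_fitnesses, nr_of_genes, start, stop)
        processes.append(multiprocessing.Process(target=_evaluate, args=args))
        start = stop
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    return list(shared_fitnesses)


if __name__ == '__main__':
    runs = 10
    for pop_size in (5, 500, 5000):
        t_iter = timeit.timeit(lambda: do_iter(pop_size, 2, 1), number=runs)
        t_proc = timeit.timeit(lambda: do_multiproc(pop_size, 2, 1), number=runs)
        print(pop_size, t_iter / runs, t_proc / runs)
```

One difference from the question's code: the worker here slices shared_population, whereas evaluation in the question slices shared_fitnesses, which holds only pop_size values, so the two timed loops there do not do the same amount of work.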

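Also, it looks like you are running only 10 tests for each argument combination, which may not be enough to get accurate timings. timeit.timeit() defaults to 1000000 iterations.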
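
As a rough illustration of controlling the repetition count, the sketch below uses timeit.repeat with explicit number and repeat arguments and reports the fastest batch. The names best_time and workload are hypothetical, and the values 100 and 5 are arbitrary; for an expensive call such as a full population evaluation, the default of 1000000 iterations would likely be impractical, so number would need to be chosen to fit the workload.

```python
import timeit


def best_time(func, number=100, repeat=5):
    # timeit.repeat returns `repeat` totals, each covering `number` calls.
    totals = timeit.repeat(func, number=number, repeat=repeat)
    # The minimum is the batch least disturbed by other system activity;
    # dividing by `number` gives the time per call.
    return min(totals) / number


def workload():
    # Hypothetical stand-in for one evaluation of the population.
    return sum(i * i for i in range(1000))


if __name__ == '__main__':
    print("seconds per call:", best_time(workload))
```

Taking the minimum over several batches, rather than a single average over 10 runs, tends to make the two methods easier to compare because it filters out one-off delays.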
