Improving speed by eliminating loops

Posted 2024-06-16 10:45:52


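I have the following problem. The code below successfully fits a line to the data from sample 50 to sample 400 (I never have more than 400 samples, and the first 50 are of very poor quality). The third dimension has size 7 and the fourth dimension can be as large as 10000, so this loop "solution" takes a long time. How can I avoid the for loops and reduce the run time? Thanks for your help (I am very new to Python).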

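from sklearn.linear_model import TheilSenRegressor
import numpy as np

# `data` is assumed to be a 4-D NumPy array defined elsewhere
skip_v = 50  # number of leading samples to skip
N = 400      # index one past the last sample used in the fit
test_n = np.arange(skip_v, N).reshape(-1, 1)  # sample indices as a column vector
f_n = 7      # size of the third dimension
d4 = np.shape(data)
a6 = np.ones((f_n, d4[3]))  # fitted slopes
b6 = np.ones((f_n, d4[3]))  # fitted intercepts
for j in range(d4[3]):
    for i in range(f_n):
        theil = TheilSenRegressor(random_state=0).fit(test_n, np.log(data[skip_v:N, 3, i, j]))
        a6[i, j] = theil.coef_[0]  # coef_ is a length-1 array
        b6[i, j] = theil.intercept_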

1 answer
User
#1 · Posted 2024-06-16 10:45:52

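You can use multiprocessing to run the loop iterations in parallel. The code below will not run as is; it only demonstrates the approach. It only pays off when your numbers are really large. Otherwise, running sequentially will be faster.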

from sklearn.linear_model import TheilSenRegressor
import numpy as np
import multiprocessing as mp
from itertools import product

def worker_function(input_queue, output_queue, skip_v, test_n, data):
    for task in iter(input_queue.get, 'STOP'):
        i = task[0]
        j = task[1]
        theil = TheilSenRegressor(random_state=0).fit(test_n,np.log(data[skip_v:,3,i,j]))
        output_queue.put([i, j, theil])

if __name__ == "__main__":
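    # define data here (a 4-D NumPy array, as in the question)

    f_n = 7
    d4 = np.shape(data)
    skip_v = 50

    N = 400
    test_n = np.reshape(range(skip_v, N), (-1, 1))

    # result arrays, filled in once the workers report back
    a6 = np.ones((f_n, d4[3]))
    b6 = np.ones((f_n, d4[3]))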

    input_queue = mp.Queue()
    output_queue = mp.Queue()

    # create all (i, j) combinations of the original nested loop
    tasks = list(product(range(f_n), range(d4[3])))

    numProc = 4

    # start processes
    process = [mp.Process(target=worker_function,
                          args=(input_queue, output_queue,
                                skip_v, test_n, data)) for x in range(numProc)]

    for p in process:
        p.start()

    # queue tasks
    for task in tasks:
        input_queue.put(task)

    # signal workers to stop after tasks are all done
    for _ in range(numProc):
        input_queue.put('STOP')

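    # get the results
    for _ in range(len(tasks)):
        res = output_queue.get(block=True)  # wait for the next result
        a6[res[0], res[1]] = res[2].coef_[0]  # coef_ is a length-1 array
        b6[res[0], res[1]] = res[2].intercept_

    # all results are in, so the workers have seen 'STOP' and can be joined
    for p in process:
        p.join()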
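
If a slightly different estimator is acceptable, the loops over i and j can also be replaced by array operations. The sketch below is only an illustration under stated assumptions: it implements the classic Theil-Sen estimate (median of all pairwise slopes, then median of the residual offsets), which is not the same algorithm as scikit-learn's TheilSenRegressor, so the numbers will not match exactly. The function name `theil_sen_columns` and the `chunk` parameter are hypothetical names introduced here, and the x values are assumed to be distinct.

```python
import numpy as np


def theil_sen_columns(x, Y, chunk=256):
    """Classic Theil-Sen line fit of every column of Y against x.

    x has shape (n,), Y has shape (n, m). Returns (slopes, intercepts),
    each of shape (m,). Columns are processed in blocks of `chunk`
    to bound the memory needed for the pairwise slopes.
    """
    x = np.asarray(x, dtype=float)
    Y = np.asarray(Y, dtype=float)
    i, j = np.triu_indices(len(x), k=1)   # all point pairs with i < j
    dx = (x[j] - x[i])[:, None]           # nonzero because x is assumed distinct
    slopes = np.empty(Y.shape[1])
    intercepts = np.empty(Y.shape[1])
    for start in range(0, Y.shape[1], chunk):
        block = Y[:, start:start + chunk]
        pair_slopes = (block[j] - block[i]) / dx          # (n_pairs, block width)
        s = np.median(pair_slopes, axis=0)                # median slope per column
        slopes[start:start + chunk] = s
        # intercept: median of y - slope * x for each column
        intercepts[start:start + chunk] = np.median(block - s * x[:, None], axis=0)
    return slopes, intercepts


# Synthetic check: two exact lines, one of them with a gross outlier.
x = np.arange(50, 70)
Y = np.column_stack([2.0 * x + 1.0, 3.0 * x - 5.0])
Y[5, 0] = 1000.0
slopes, intercepts = theil_sen_columns(x, Y)
print(slopes, intercepts)
```

For the arrays in the question, one would presumably pass `np.arange(skip_v, N)` as `x` and `np.log(data[skip_v:N, 3]).reshape(N - skip_v, -1)` as `Y`, then reshape both results to `(f_n, d4[3])` to get a6 and b6. With 350 samples there are 61075 point pairs, so each chunk of 256 columns needs on the order of a hundred megabytes or more; lower `chunk` if memory is tight.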
