Writing to a dictionary of objects in parallel

2 votes

1 answer

3076 views

Asked 2025-04-18 05:52

I have a dictionary of objects that I want to populate using multiprocessing. The snippet below runs "Run" several times concurrently.

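from multiprocessing import Process

Data = dict()
for i in range(n_runs):  # n_runs (placeholder): how many simulations to launch
    Data[i] = dataobj(i)  # dataobj is a class I have defined elsewhere
    proc = Process(target=Run, args=(i, Data[i]))
    proc.start()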

Here "Run" performs some simulations and saves the results into the dataobj object.

def Run(i, out):
    # [...some code to run simulations....]
    out.extract(file)

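My code creates a dictionary of objects and then modifies those objects in parallel. Will this work, or do I need to acquire a lock every time I modify an object in the shared dictionary?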

Tags: thread-safety, locking, multiprocessing, parallel-processing, object, dictionary

1 Answer

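In short, when you use multiprocessing, each child process works on its own copy of the original dictionary object rather than on the original. Each child therefore fills in a different copy, and none of those changes show up in the parent. A lock would not change that. What the multiprocessing package does handle for you is passing Python objects as messages between processes, which makes things a bit easier.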
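To make the "copy" point concrete, here is a minimal sketch. `DataObj`, `run` and `demo` are hypothetical stand-ins for the question's `dataobj` and `Run`, not the asker's real code. Each child assigns to its own copy of the object, so the parent still sees the initial values after all children have finished.

```python
from multiprocessing import Process


class DataObj:
    """Hypothetical stand-in for the question's dataobj class."""

    def __init__(self, i):
        self.i = i
        self.result = None


def run(i, out):
    # This assignment happens on the child's private copy of `out`.
    out.result = i * 10


def demo():
    data = {i: DataObj(i) for i in range(3)}
    procs = [Process(target=run, args=(i, data[i])) for i in data]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # The parent's objects were never touched by the children.
    return [data[i].result for i in data]


if __name__ == "__main__":
    print(demo())  # prints [None, None, None]
```

The `if __name__ == "__main__":` guard matters on platforms where child processes re-import the main module.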

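A good design is to have the main process populate the dictionary and let its child processes do the actual work, then use queues to exchange data between the children and the main process.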
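Applied to the question's setup, that could look roughly like the sketch below. It assumes each run produces exactly one picklable result; `run`, `collect` and the `i * i` "simulation" are placeholders invented for illustration. The children only send `(key, value)` messages, and the parent is the only process that writes to the dictionary, so no lock is needed.

```python
from multiprocessing import Process, Queue


def run(i, results):
    # Placeholder for the real simulation; i * i is a made-up result.
    value = i * i
    # Send the result back instead of mutating a shared object.
    results.put((i, value))


def collect(n_runs):
    results = Queue()
    procs = [Process(target=run, args=(i, results)) for i in range(n_runs)]
    for p in procs:
        p.start()
    data = {}
    # Exactly one message per child is expected, so n_runs blocking gets suffice.
    for _ in range(n_runs):
        i, value = results.get()
        data[i] = value
    # Join only after draining the queue, so no child is stuck flushing its data.
    for p in procs:
        p.join()
    return data


if __name__ == "__main__":
    print(collect(4))
```

Note that if a child crashes before calling `put`, the parent's `get` would block forever; a timeout on `get` is one way to guard against that.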

As a general design idea, here is something you could do:

from multiprocessing import Process, Queue

# one (input, output) queue pair per child process
queues = [(Queue(), Queue()), (Queue(), Queue())]

def simulate(qin, qout):
    # caveat: empty() is only a snapshot, so a child that starts before
    # any data has been queued sees an empty queue and exits right away
    while not qin.empty():
        data = qin.get()
        # work with the data
        qout.put(data)
    # when the queue is empty, the process ends

Process(target=simulate, args=(queues[0][0], queues[0][1])).start()
Process(target=simulate, args=(queues[1][0], queues[1][1])).start()

processed_data_list = []

# first send the data to be processed to the children processes
while data.there_is_more_to_process():
    # here you have to adapt to your context how you want to split the load between your processes
    queues[0][0].put(data.pop_some_data())
    queues[1][0].put(data.pop_some_data())

# then for each process' queue
for qin, qout in queues:
    # you populate your output data list (or dict or whatever)
    while not qout.empty():
        processed_data_list.append(qout.get())
# here again, you have to adapt to your context how you handle the data sent
# back from the children processes.

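That said, this is only a design idea: the code has some design flaws, which would naturally get resolved once you work with real data and real processing functions.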
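One of those flaws is polling `empty()`, which can report an empty queue before any data has arrived. A common way around it is a sentinel value that tells each worker to stop. The sketch below is one possible variant, not the only fix: it uses a single shared input queue and a single output queue, the name `process_all` and the doubling "work" are made up for illustration, and it assumes work items are hashable and never `None`.

```python
from multiprocessing import Process, Queue

STOP = None  # sentinel meaning "no more work"; real items must not be None


def simulate(qin, qout):
    # iter(qin.get, STOP) blocks on each get and stops at the sentinel.
    for item in iter(qin.get, STOP):
        qout.put((item, item * 2))  # doubling stands in for the real work


def process_all(items, n_workers=2):
    items = list(items)
    qin, qout = Queue(), Queue()
    workers = [Process(target=simulate, args=(qin, qout)) for _ in range(n_workers)]
    for w in workers:
        w.start()
    for item in items:
        qin.put(item)
    for _ in workers:
        qin.put(STOP)  # one sentinel per worker
    # One result per item is expected, so count them instead of polling empty().
    processed = dict(qout.get() for _ in items)
    for w in workers:
        w.join()
    return processed


if __name__ == "__main__":
    print(process_all([1, 2, 3, 4, 5]))
```

With a shared input queue, whichever worker is free picks up the next item, so the load does not have to be split by hand.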

Answered 2025-04-18 by Python大师
