Deadlock with joblib

Posted 2024-04-25 20:24:23


I want to use joblib to convert numpy arrays into a bcolz carray in parallel, but a deadlock occurs when writing to the carray.

Here is the simplified code.

import cv2
import pandas as pd
import numpy as np
from tqdm import tqdm
import bcolz
from pathlib import Path
from joblib import Parallel, delayed

class BcolzConverter():
    """
    Save the output from an iterator without loading all images into memory.
    Does not return anything, instead writes data to disk.
    :it: iterator yielding (image, label) pairs
    :data_dir: The folder name to store the bcolz array representing the features in.
    :labels_dir: The folder name to store the bcolz array representing the labels in.
    """    
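    def __init__(self, it, data_dir, labels_dir, num_workers=-1, mode='w'):
        self.it = it
        self.data_dir = data_dir
        self.labels_dir = labels_dir
        self.num_workers = num_workers  # passed to joblib as n_jobs
        self.mode = mode  # bcolz open mode; 'w' is an assumed default

    def convert(self):
        for directory in [self.data_dir, self.labels_dir]:
            if not Path(directory).exists():
                Path(directory).mkdir(parents=True)

        # Create the on-disk arrays from the first element of the iterator.
        d, l = next(self.it)
        data = bcolz.carray(d, rootdir=self.data_dir, mode=self.mode)
        labels = bcolz.carray(l, rootdir=self.labels_dir, mode=self.mode)

        # Every worker appends to the same carrays; this is where the hang occurs.
        Parallel(n_jobs=self.num_workers)(delayed(self._process)(
            element, data, labels) for element in self.it)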

        data.flush()
        labels.flush()

    @staticmethod
    def _process(element, data, labels):
        d, l = element
        data.append(d)
        labels.append(l)

converter = BcolzConverter(it, '../input/bcolz/data', '../input/bcolz/label', num_workers=4)
converter.convert()

The program above never finishes unless the user interrupts it. The output is here. The program is stuck in waiter.acquire().

[traceback not preserved in the source]

This seems to be caused by multiple processes trying to write to the carray object. How can I do this without a deadlock?
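
One possible way around this, assuming the hang really does come from several worker processes appending to the same carray, is to let the workers do only the per-item computation and keep every `append` in the parent process. The sketch below is not a confirmed fix. `preprocess`, `convert_single_writer`, and `batch_size` are hypothetical names introduced for illustration, and `data` and `labels` can be any objects with an `append` method, such as the two `bcolz.carray` objects from the code above.

```python
from itertools import islice

from joblib import Parallel, delayed


def preprocess(element):
    # Hypothetical per-item work done in a worker. It must not touch the
    # on-disk arrays; it only returns picklable results to the parent.
    d, l = element
    return d, l


def convert_single_writer(it, data, labels, n_jobs=2, batch_size=64):
    """Run `preprocess` in parallel, but append only from the parent process."""
    it = iter(it)
    # Reuse one pool of workers across all batches.
    with Parallel(n_jobs=n_jobs) as parallel:
        while True:
            # Pull a bounded batch so the whole dataset is never held in memory.
            batch = list(islice(it, batch_size))
            if not batch:
                break
            # joblib returns results in the same order as the inputs.
            results = parallel(delayed(preprocess)(e) for e in batch)
            # Single writer: only this process ever calls append.
            for d, l in results:
                data.append(d)
                labels.append(l)
```

With real arrays, `data.flush()` and `labels.flush()` would still be called afterwards, as in the original code. The trade-off is that writing is serialized, so this only helps when the per-item work in `preprocess` is the expensive part.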

