Numba slower than pure Python in frequency counting

Posted 2024-04-26 22:04:56


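Given a data matrix with discrete entries, represented as a 2D numpy array, I'm trying to compute the observed frequencies of some features (the columns), looking only at some instances (the rows of the matrix).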

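I can do that quite easily with numpy by applying bincount to each slice after some fancy indexing. Doing the same in pure Python, with an external data structure as the count accumulator, is a C-style double loop.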

import numpy

import numba

try:
    from time import perf_counter
except ImportError:
    # Python < 3.3 has no perf_counter; fall back to time.time
    from time import time
    perf_counter = time


def estimate_counts_numpy(data,
                          instance_ids,
                          feature_ids):
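    """
    Count the values of each selected feature over the selected
    instances, using numpy.bincount on each column slice.
    """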
    #
    # slicing the data array (probably memory consuming)
    curr_data_slice = data[instance_ids, :][:, feature_ids]

    estimated_counts = []
    for feature_slice in curr_data_slice.T:
        counts = numpy.bincount(feature_slice)
        #
        # checking just for the all 0 case:
        # this is not stable for not binary datasets TODO: fix it
        if counts.shape[0] < 2:
            counts = numpy.append(counts, [0], 0)
        estimated_counts.append(counts)

    return estimated_counts


@numba.jit(numba.types.int32[:, :](numba.types.int8[:, :],
                                   numba.types.int32[:],
                                   numba.types.int32[:],
                                   numba.types.int32[:],
                                   numba.types.int32[:, :]))
def estimate_counts_numba(data,
                          instance_ids,
                          feature_ids,
                          feature_vals,
                          estimated_counts):
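    """
    Count the values of each selected feature over the selected
    instances with an explicit double loop, accumulating into
    estimated_counts.
    """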

    #
    # actual counting
    for i, feature_id in enumerate(feature_ids):
        for instance_id in instance_ids:
            estimated_counts[i][data[instance_id, feature_id]] += 1

    return estimated_counts


if __name__ == '__main__':
    #
    # creating a large synthetic matrix, testing for performance
    rand_gen = numpy.random.RandomState(1337)
    n_instances = 2000
    n_features = 2000
    large_matrix = rand_gen.binomial(1, 0.5, (n_instances, n_features))
    #
    # random indexes too
    n_sample = 1000
    rand_instance_ids = rand_gen.choice(n_instances, n_sample, replace=False)
    rand_feature_ids = rand_gen.choice(n_features, n_sample, replace=False)
    binary_feature_vals = [2 for i in range(n_features)]
    #
    # testing
    numpy_start_t = perf_counter()

    e_counts_numpy = estimate_counts_numpy(large_matrix,
                                           rand_instance_ids,
                                           rand_feature_ids)
    numpy_end_t = perf_counter()
    print('numpy done in {0} secs'.format(numpy_end_t - numpy_start_t))

    binary_feature_vals = numpy.array(binary_feature_vals)
    #
    #
    curr_feature_vals = binary_feature_vals[rand_feature_ids]
    #
    # creating a data structure to hold the slices
    # (with numba I cannot use list comprehension?)
    # e_counts_numba = [[0 for val in range(feature_val)]
    #                   for feature_val in
    #                   curr_feature_vals]
    e_counts_numba = numpy.zeros((n_sample, 2), dtype='int32')
    numba_start_t = perf_counter()

    estimate_counts_numba(large_matrix,
                          rand_instance_ids,
                          rand_feature_ids,
                          binary_feature_vals,
                          e_counts_numba)
    numba_end_t = perf_counter()
    print('numba done in {0} secs'.format(numba_end_t - numba_start_t))

These are the times I get when running the code above:

[timing output missing from the source]

My point is that my implementation is even slower when I apply a JIT with numba, so I strongly suspect I am messing something up.


1 answer
User
#1 · posted 2024-04-26 22:04:56

Your function is slow because Numba has fallen back to object mode to compile the loop.

There are two problems:

  1. Numba doesn't yet support chained indexing of multidimensional arrays, so you need to rewrite this:

estimated_counts[i][data[instance_id, feature_id]]

as this:

estimated_counts[i, data[instance_id, feature_id]]

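  2. The explicit type signature is incorrect. The data and index arrays are actually int64, not int8/int32. Rather than fixing your signature, you can rely on Numba's automatic JIT to detect the argument types and compile the right version: just change the decorator to @numba.jit. If you don't want the timing to include compilation, make sure you call the function once before benchmarking.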
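To see the dtype mismatch for yourself, a minimal check along these lines may help. It rebuilds smaller arrays the same way the question does (the sizes and the names `data8` and `ids32` are my own, not from the question) and prints the dtypes NumPy actually chose. They are the platform's default integer type, which is int64 on most 64-bit Linux and macOS setups but may differ elsewhere. If you did want to keep an explicit signature, casting the inputs as sketched at the end is one option.

```python
import numpy

rand_gen = numpy.random.RandomState(1337)

# Built the same way as in the question, only smaller.
data = rand_gen.binomial(1, 0.5, (20, 20))
ids = rand_gen.choice(20, 10, replace=False)

# Neither array is int8 or int32 unless you ask for it:
# NumPy picks the platform default integer type.
print(data.dtype, ids.dtype)

# Assumption: if an explicit signature is kept, the inputs
# would have to be cast to match it before the call.
data8 = data.astype(numpy.int8)
ids32 = ids.astype(numpy.int32)
print(data8.dtype, ids32.dtype)
```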

With these changes, I benchmark Numba as about 15% faster than NumPy.
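Putting both fixes together, a sketch of the corrected function might look like the following. A few things here are my own choices rather than part of the answer: I use `numba.njit` (nopython mode) instead of plain `@numba.jit` so that a compilation problem raises an error instead of silently falling back, I drop the unused `feature_vals` argument, I shrink the matrix so it runs quickly, and I add a fallback that runs the function uncompiled when Numba is not installed. The last lines cross-check the result against the bincount approach.

```python
import numpy
from time import perf_counter

try:
    import numba
    jit = numba.njit  # nopython mode: fails loudly instead of falling back
except ImportError:
    # Assumption for this sketch only: run uncompiled without Numba.
    def jit(func):
        return func


@jit
def estimate_counts_numba(data, instance_ids, feature_ids, estimated_counts):
    # One row of counts per selected feature, one column per value.
    for i, feature_id in enumerate(feature_ids):
        for instance_id in instance_ids:
            # Single tuple index instead of chained [i][...] indexing.
            estimated_counts[i, data[instance_id, feature_id]] += 1
    return estimated_counts


rand_gen = numpy.random.RandomState(1337)
n_instances, n_features, n_sample = 200, 200, 100
data = rand_gen.binomial(1, 0.5, (n_instances, n_features))
instance_ids = rand_gen.choice(n_instances, n_sample, replace=False)
feature_ids = rand_gen.choice(n_features, n_sample, replace=False)

# Warm-up call on a throwaway accumulator, so that compilation
# time is not included in the measurement below.
estimate_counts_numba(data, instance_ids, feature_ids,
                      numpy.zeros((n_sample, 2), dtype=numpy.int64))

counts = numpy.zeros((n_sample, 2), dtype=numpy.int64)
start_t = perf_counter()
estimate_counts_numba(data, instance_ids, feature_ids, counts)
elapsed = perf_counter() - start_t
print('numba done in {0} secs'.format(elapsed))

# Cross-check against the NumPy bincount approach from the question.
expected = numpy.array([numpy.bincount(col, minlength=2)
                        for col in data[instance_ids][:, feature_ids].T])
print('matches bincount:', numpy.array_equal(counts, expected))
```

Note that the accumulator is freshly zeroed for the timed call: the function adds into `estimated_counts` in place, so reusing the warm-up array would double the counts.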
