How can I combine forward and backward filling of NaNs in a 3D numpy array using the nearest value?

Posted 2024-05-16 17:44:05


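I am working with time series data (satellite images) and would like to fill missing values (e.g., clouds) with the nearest non-missing value. I found the very helpful post "Most efficient way to forward-fill NaN values in numpy array", which answered many of my questions. Forward and backward filling of the array both work very well and are fast. Now I would like to combine the two approaches into one in which only the "nearest" value is chosen.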

import numpy as np

def np_ffill(arr, axis):
    idx_shape = tuple([slice(None)] + [np.newaxis] * (len(arr.shape) - axis - 1))
    fwd_idx = np.where(~np.isnan(arr), np.arange(arr.shape[axis])[idx_shape], 0)
    fwd_idx = np.maximum.accumulate(fwd_idx, axis=axis)
    slc = [np.arange(k)[tuple([slice(None) if dim==i else np.newaxis
        for dim in range(len(arr.shape))])]
        for i, k in enumerate(arr.shape)]
    slc[axis] = fwd_idx
    return arr[tuple(slc)]

def np_bfill(arr, axis):
    idx_shape = tuple([slice(None)] + [np.newaxis] * (len(arr.shape) - axis - 1))
    bwd_idx = np.where(~np.isnan(arr), np.arange(arr.shape[axis])[idx_shape], arr.shape[axis] - 1)
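    # Reverse along `axis`, take the running minimum, then reverse back
    # (np.flip keeps this correct for any axis, not only the last axis of a 3D array)
    bwd_idx = np.flip(np.minimum.accumulate(np.flip(bwd_idx, axis=axis), axis=axis), axis=axis)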
    slc = [np.arange(k)[tuple([slice(None) if dim==i else np.newaxis
        for dim in range(len(arr.shape))])]
        for i, k in enumerate(arr.shape)]
    slc[axis] = bwd_idx
    return arr[tuple(slc)]

def random_array(shape):
    choices = [1, 2, 3, 4, np.nan]
    out = np.random.choice(choices, size=shape)
    return out
    
ra = random_array((10, 10, 5)) # for testing, I assume 5 images with the size of 10x10 pixels
ffill = np_ffill(ra,2) # the filling should only be applied on the last axis (2)
bfill = np_bfill(ra,2)

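So far, my only idea is to compare the indices fwd_idx and bwd_idx to determine which position is closer to the one being filled. However, that would mean writing a for loop again. Isn't there a vectorized numpy way to do this as well? Many thanks for your help.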
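The index comparison described above can be done without a loop, because both index arrays have the same shape as the data. The following is only a sketch under some assumptions: the function name `np_nearest_fill` is hypothetical, ties (equal distance on both sides) are resolved in favor of the earlier value, which the question does not specify, and series that are entirely NaN are left as NaN. It relies on `np.take_along_axis`, which is available in NumPy 1.15 and later.

```python
import numpy as np

def np_nearest_fill(arr, axis=-1):
    """Fill NaNs along `axis` with the nearest non-NaN value.

    Hypothetical helper: ties go to the earlier (forward-filled) value.
    """
    arr = np.asarray(arr, dtype=float)
    a = np.moveaxis(arr, axis, -1)      # work on the last axis
    n = a.shape[-1]
    valid = ~np.isnan(a)
    pos = np.arange(n)                  # broadcasts against the last axis

    # Index of the closest valid sample at or before each position (-1 if none).
    fwd_idx = np.maximum.accumulate(np.where(valid, pos, -1), axis=-1)
    # Index of the closest valid sample at or after each position (n if none).
    bwd_idx = np.minimum.accumulate(
        np.where(valid, pos, n)[..., ::-1], axis=-1)[..., ::-1]

    # Distances to each candidate; a missing neighbour gets distance n,
    # which is larger than any real distance (at most n - 1).
    fwd_dist = np.where(fwd_idx >= 0, pos - fwd_idx, n)
    bwd_dist = np.where(bwd_idx < n, bwd_idx - pos, n)

    # Pick the closer candidate; clip only matters for all-NaN series,
    # where the selected element is NaN anyway.
    nearest = np.where(fwd_dist <= bwd_dist, fwd_idx, bwd_idx)
    nearest = np.clip(nearest, 0, n - 1)

    filled = np.take_along_axis(a, nearest, axis=-1)
    return np.moveaxis(filled, -1, axis)

# Small deterministic check on a single pixel's time series
x = np.array([np.nan, 1.0, np.nan, np.nan, 4.0, np.nan])
print(np_nearest_fill(x, axis=0))       # [1. 1. 1. 4. 4. 4.]
```

With the test array from the question, the call would be `np_nearest_fill(ra, 2)`. Changing `<=` to `<` in the `nearest` line would make ties prefer the later value instead.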

