Interpolating np.nan values in scipy

1 vote

1 answer

1122 views

Asked 2025-04-18 05:12

I want to fill in the np.nan value by interpolation, based on the elements marked with comments in the code below. These are the elements at the same position as the np.nan in the other layers of the array.

import numpy as np
from scipy.interpolate import interp1d
data = np.array([[[3, 2, 1, 3, 2],
                  [np.nan, 1, 1, 4, 4],   # <-- value to fill
                  [4, 2, 3, 3, 4],
                  [1, 1, 4, 1, 5],
                  [2, 4, 5, 2, 1]],

                 [[6, 7, 10, 6, 6],
                  [5, 9, 8, 6, 9],        # <-- 5: same position, second layer
                  [6, 10, 9, 8, 10],
                  [6, 8, 7, 10, 8],
                  [10, 9, 9, 10, 8]],

                 [[12, 14, 12, 15, 15],
                  [21, 11, 14, 14, 11],   # <-- 21: same position, third layer
                  [13, 13, 16, 15, 11],
                  [14, 15, 14, 16, 14],
                  [13, 15, 11, 11, 14]]])

result = interp1d(data, kind='cubic')
print result

This results in

TypeError: __init__() takes at least 3 arguments (3 given)

So what is the best way to do this? I need to process very large arrays, so I am looking for an efficient approach. Thanks.


1 Answer

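Your question is a bit too broad. Should the interpolation ignore the rest of each 5*5 matrix and use only the values in the same cell of the other layers? Even if so, it is still too broad, because there are many interpolation tools for different needs. I think the simplest "nearest neighbor" approach may be a good starting point, although the scipy.interpolate documentation may not be very friendly to some readers: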

In [1]:

import numpy as np
data = np.array([[[3, 2, 1, 3, 2],
                  [np.nan, 1, 1, 4, 4],
                  [4, 2, 3, 3, 4],
                  [1, 1, 4, 1, 5],
                  [2, 4, 5, 2, 1]],

                 [[6, 7, 10, 6, 6],
                  [5, 9, 8, 6, 9],
                  [6, 10, 9, 8, 10],
                  [6, 8, 7, 10, 8],
                  [10, 9, 9, 10, 8]],

                 [[12, 14, 12, 15, 15],
                  [21, 11, 14, 14, 11],
                  [13, 13, 16, 15, 11],
                  [14, 15, 14, 16, 14],
                  [13, 15, 11, 11, 14]]])
In [2]:

data1=data.reshape((3,-1))
In [3]:
#the one you want to interpolate
data1[:,(np.isnan(data.reshape((3,-1))).any(0))]
Out[3]:
array([[ nan],
       [  5.],
       [ 21.]])
In [4]:
#the other 'good' data points
data1[:,~(np.isnan(data.reshape((3,-1))).any(0))]
Out[4]:
array([[  3.,   2.,   1.,   3.,   2.,   1.,   1.,   4.,   4.,   4.,   2.,
          3.,   3.,   4.,   1.,   1.,   4.,   1.,   5.,   2.,   4.,   5.,
          2.,   1.],
       [  6.,   7.,  10.,   6.,   6.,   9.,   8.,   6.,   9.,   6.,  10.,
          9.,   8.,  10.,   6.,   8.,   7.,  10.,   8.,  10.,   9.,   9.,
         10.,   8.],
       [ 12.,  14.,  12.,  15.,  15.,  11.,  14.,  14.,  11.,  13.,  13.,
         16.,  15.,  11.,  14.,  15.,  14.,  16.,  14.,  13.,  15.,  11.,
         11.,  14.]])
In [5]:

import scipy.interpolate as si
In [6]:

Q=si.NearestNDInterpolator(data1[:,~(np.isnan(data.reshape((3,-1))).any(0))][[1,2]].T, 
                           data1[:,~(np.isnan(data.reshape((3,-1))).any(0))][0])
In [8]:
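#the first value is the distance to the nearest neighbor, the 2nd is its index among the 'good' points;
#the interpolated value itself is the first-layer value at that index, which can be obtained with Q([5, 21]).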
Q.tree.query([5,21])
Out[8]:
(6.082762530298219, 3)
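
The session above stops at the tree query. As a minimal sketch of how the nearest-neighbor value could be written back into the array, the snippet below wraps the same steps in a function. The function name fill_first_layer_nearest is made up for illustration, and the code assumes that NaNs occur only in the first layer, as in the example data. Note that in this data two 'good' cells are at exactly the same distance from (5, 21), so which of their first-layer values is picked depends on how the tree breaks ties.

```python
import numpy as np
from scipy.interpolate import NearestNDInterpolator

data = np.array([[[3, 2, 1, 3, 2],
                  [np.nan, 1, 1, 4, 4],
                  [4, 2, 3, 3, 4],
                  [1, 1, 4, 1, 5],
                  [2, 4, 5, 2, 1]],

                 [[6, 7, 10, 6, 6],
                  [5, 9, 8, 6, 9],
                  [6, 10, 9, 8, 10],
                  [6, 8, 7, 10, 8],
                  [10, 9, 9, 10, 8]],

                 [[12, 14, 12, 15, 15],
                  [21, 11, 14, 14, 11],
                  [13, 13, 16, 15, 11],
                  [14, 15, 14, 16, 14],
                  [13, 15, 11, 11, 14]]], dtype=float)


def fill_first_layer_nearest(arr):
    """Hypothetical helper: fill NaNs in arr[0] by nearest neighbor.

    Assumption: NaNs appear only in the first layer. The values of the
    remaining layers in each cell are used as the coordinates.
    """
    # One column per cell: shape (n_layers, n_cells). astype returns a copy,
    # so the input array is left untouched.
    flat = arr.reshape((arr.shape[0], -1)).astype(float)
    bad = np.isnan(flat[0])               # cells to fill
    good = ~np.isnan(flat).any(axis=0)    # cells with complete data
    if bad.any():
        # Coordinates: values in layers 1..n; target: value in layer 0.
        interp = NearestNDInterpolator(flat[1:, good].T, flat[0, good])
        flat[0, bad] = interp(flat[1:, bad].T)
    return flat.reshape(arr.shape)


filled = fill_first_layer_nearest(data)
print(filled[0, 1, 0])  # first-layer value of the nearest 'good' cell
```

Because all missing cells are passed to the interpolator in a single call, this should scale to large arrays better than querying one point at a time. If a proper interpolation along the layer axis is needed instead, a different tool from scipy.interpolate would have to be chosen.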

Answered 2025-04-18 by Python大师
