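Interpolating np.nan values in scipy
I want to fill in the np.nan value by interpolation, based on the elements shown in bold (wrapped in ** in the code below). These are the elements at the same position as the np.nan in the other layers of the array, i.e. along the first axis.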
import numpy as np
from scipy.interpolate import interp1d
data = np.array([[[3, 2, 1, 3, 2],
[**np.nan**, 1, 1, 4, 4],
[4, 2, 3, 3, 4],
[1, 1, 4, 1, 5],
[2, 4, 5, 2, 1]],
[[6, 7, 10, 6, 6],
[**5**, 9, 8, 6, 9],
[6, 10, 9, 8, 10],
[6, 8, 7, 10, 8],
[10, 9, 9, 10, 8]],
[[12, 14, 12, 15, 15],
[**21**, 11, 14, 14, 11],
[13, 13, 16, 15, 11],
[14, 15, 14, 16, 14],
[13, 15, 11, 11, 14]]])
result = interp1d(data, kind='cubic')
print result
This produces:
TypeError: __init__() takes at least 3 arguments (3 given)
So what is the best way to do this? I need to process very large arrays, so I am looking for an efficient approach. Thanks.
1 Answer
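Your question is a bit too broad. Does the interpolation not need any information from within the 5*5 matrix, only the values of the same cell in the other dimensions? Even if so, it is still broad, because there are many interpolation tools that serve different needs. I think the simplest "nearest neighbor" method is probably a good starting point, although the scipy.interpolate documentation may not be very friendly to some readers: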
In [1]:
data = np.array([[[3, 2, 1, 3, 2],
[np.nan, 1, 1, 4, 4],
[4, 2, 3, 3, 4],
[1, 1, 4, 1, 5],
[2, 4, 5, 2, 1]],
[[6, 7, 10, 6, 6],
[5, 9, 8, 6, 9],
[6, 10, 9, 8, 10],
[6, 8, 7, 10, 8],
[10, 9, 9, 10, 8]],
[[12, 14, 12, 15, 15],
[21, 11, 14, 14, 11],
[13, 13, 16, 15, 11],
[14, 15, 14, 16, 14],
[13, 15, 11, 11, 14]]])
In [2]:
data1=data.reshape((3,-1))
In [3]:
#the one you want to interpolate
data1[:,(np.isnan(data.reshape((3,-1))).any(0))]
Out[3]:
array([[ nan],
[ 5.],
[ 21.]])
In [4]:
#the other 'good' data points
data1[:,~(np.isnan(data.reshape((3,-1))).any(0))]
Out[4]:
array([[ 3., 2., 1., 3., 2., 1., 1., 4., 4., 4., 2.,
3., 3., 4., 1., 1., 4., 1., 5., 2., 4., 5.,
2., 1.],
[ 6., 7., 10., 6., 6., 9., 8., 6., 9., 6., 10.,
9., 8., 10., 6., 8., 7., 10., 8., 10., 9., 9.,
10., 8.],
[ 12., 14., 12., 15., 15., 11., 14., 14., 11., 13., 13.,
16., 15., 11., 14., 15., 14., 16., 14., 13., 15., 11.,
11., 14.]])
In [5]:
import scipy.interpolate as si
In [6]:
Q=si.NearestNDInterpolator(data1[:,~(np.isnan(data.reshape((3,-1))).any(0))][[1,2]].T,
data1[:,~(np.isnan(data.reshape((3,-1))).any(0))][0])
In [8]:
#query returns (distance, index): the 1st value is the distance to the nearest neighbor, the 2nd is that neighbor's index among the 'good' points.
Q.tree.query([5,21])
Out[8]:
(6.082762530298219, 3)
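The session above can be condensed into one script that writes the nearest-neighbor estimate back into the array. This is only a minimal sketch under two assumptions that are not stated in the question: every NaN sits in the first layer, and the other two layers are complete, so they can serve as coordinates. The names flat, bad, interp and filled are illustrative.

```python
import numpy as np
from scipy.interpolate import NearestNDInterpolator

data = np.array([[[3, 2, 1, 3, 2],
                  [np.nan, 1, 1, 4, 4],
                  [4, 2, 3, 3, 4],
                  [1, 1, 4, 1, 5],
                  [2, 4, 5, 2, 1]],
                 [[6, 7, 10, 6, 6],
                  [5, 9, 8, 6, 9],
                  [6, 10, 9, 8, 10],
                  [6, 8, 7, 10, 8],
                  [10, 9, 9, 10, 8]],
                 [[12, 14, 12, 15, 15],
                  [21, 11, 14, 14, 11],
                  [13, 13, 16, 15, 11],
                  [14, 15, 14, 16, 14],
                  [13, 15, 11, 11, 14]]], dtype=float)

# Flatten each 5x5 layer: every column is now one cell seen across the 3 layers.
flat = data.reshape(data.shape[0], -1)          # shape (3, 25)

# Compute the NaN mask once instead of repeating the reshape/isnan call.
bad = np.isnan(flat).any(axis=0)

# Assumption: NaNs occur only in layer 0, so layers 1 and 2 are the
# coordinates and layer 0 holds the values to estimate.
interp = NearestNDInterpolator(flat[1:, ~bad].T, flat[0, ~bad])

# Evaluate the interpolator at the cells that need filling and write back.
filled = flat.copy()
filled[0, bad] = interp(flat[1:, bad].T)
filled = filled.reshape(data.shape)

print(filled[0, 1, 0])
```

Note that two of the 'good' cells have the same coordinates (6, 15) and are equally close to (5, 21), but carry different values in the first layer (3 and 2). Which one is returned depends on how the underlying tree breaks the tie, so the filled value should be read as one plausible estimate, not as a unique answer.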