Univariate spline interpolation is all NaN, with no sign of bad parameters
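I wrote a function that builds a spline interpolation and first checks the input data for errors. If the data passes the checks, it produces a spline evaluated at a multiple of the original number of points. However, the function does not seem to work, even though my error checks find nothing wrong.
I check the following: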
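- The x and y data must be the same size
- The multiple of points to generate must be greater than zero: <1 gives fewer points in the spline, >1 gives more points, and 1 keeps the number of points unchanged
- The order of the spline must be between 1 and 5, inclusive
- The x data must be strictly increasing
- Neither the x nor the y data may contain a NaN (not a number)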
I thought this checklist was thorough enough to rule out an all-NaN spline, but none of the errors is triggered when I feed in my data.
Here is my code:
##
# Univariate Spline Interpolation
##
## This function interpolates the data by fitting a spline to it and evaluating the spline at a multiple of the number of points in the data set
## Input:
# dataX - X axis that corresponds to the data set
# dataY - Y axis of data to fit spline on (must be same size as dataX)
# multiple - the multiplication factor, default is 2 ( <1 - fewer points, 1 - same number of points, >1 - more points)
# order - order of spline, default is 4 (3 - Cubic, 4 - Quartic)
## Output
# spline - interpolation spline object to be used for peak detection
# splinedDataX - splined X axis
# splinedDataY - splined Y axis
# import scipy modules for spline creation and class methods
from scipy.interpolate import UnivariateSpline, LSQUnivariateSpline
#import numpy module for linear spacing creation
from numpy import linspace, NaN
def univariate_spline_interpolation(dataX, dataY, multiple=2, order=4):
    #Libraries
    from scipy.interpolate import UnivariateSpline, LSQUnivariateSpline
    from myUnivariateSpline import MyUnivariateSpline
    #Find sizes of x and y axis for comparison and multiple
    sizeX = len(dataX)
    sizeY = len(dataY)
    #Error catching
    if(sizeX != sizeY):
        print "Data X axis and Y axis must have same size"
        return
    if(multiple <= 0):
        print "Multiple must be greater than 0"
        return
    if(order < 1 or order >5):
        print "Order must be 1 <= order <= 5"
        return
    #check for monotonically increasing function
    for indx, val in enumerate(dataX): #set first value as largest value, need to have all following increase
        if indx == 0:
            high = val
            highIndx = indx
            continue
        #if the current value is lower than or equal to the stored value
        if val <= high:
            print "timestamp out of order"
            print "value at ", highIndx, "is ", high
            print "value at ", indx, "is ", val
            break
    #check for NaN in x and y
    for indx, val in enumerate(dataY):
        if(val == NaN):
            print "Value in Data Y at indx", indx, "is NaN"
            return
    for indx, val in enumerate(dataX):
        if(val == NaN):
            print "Value in Data X at indx", indx, "is NaN"
            return
    #Create Spline
    spline = UnivariateSpline(dataX, dataY, k=order, s=0)
    #Create new axis based on numPoints
    numPoints = sizeX * multiple #Find number of points for spline
    startPt = dataX[0] #find value of first point on x axis
    endPt = dataX[-1] #find value of last point on x axis
    splinedDataX = linspace(startPt, endPt, numPoints) #create evenly spaced points on axis based on start, end, and number of desired data points
    #Create Y axis of splined Data
    splinedDataY = spline(splinedDataX) #Create new Y axis with numPoints entries of data splined to fit the original data
    return spline, splinedDataX, splinedDataY
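This is one of the three data sets I tried to feed in; all three result in a returned splinedDataY that is entirely NaN. interpDataY has the same size as the corresponding interpDataX. I don't know what causes it to return only NaN.
Both sets of data are pandas Series, but converting them to lists did not fix the problem.
I don't know whether this is relevant, but when I print the list, it is full of nan rather than NaN.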
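On the nan versus NaN spelling: as far as I can tell, lowercase nan is simply how Python prints a floating-point NaN, so the spelling itself is probably not the cause. What is more likely to matter for the checks listed earlier is that NaN never compares equal to anything, itself included, so a test of the form `val == NaN` can never fire. The following is a minimal Python 3 sketch using only the standard library; `float("nan")` stands in for numpy's NaN, which is an assumption made to keep the example free of dependencies.

```python
import math

# Assumption: float("nan") stands in for numpy's NaN so the example
# needs nothing beyond the standard library.
nan_value = float("nan")

# Python prints a float NaN in lowercase, whatever name it was imported under.
printed = str(nan_value)
print(printed)

# NaN is not equal to anything, itself included, so an equality test
# against NaN is always False.
equal_to_itself = (nan_value == nan_value)
print(equal_to_itself)

# math.isnan is the reliable test for a single float.
detected = math.isnan(nan_value)
print(detected)
```

For whole arrays or pandas Series, a vectorized NaN test would do the same job, but the single-value form above is enough to show why the equality check stays silent.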
1 Answer
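Rows 152 and 153 of the data hold the same value, 5.86868286133. That may well be the cause of the problem. I am not familiar enough with Python to say why your checking code did not catch it, but you should be able to track it down.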
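One possible explanation for why the check in the question misses the duplicate, offered as a sketch rather than a confirmed diagnosis: the loop stores the first x value in `high` and never updates it, so every later value is compared only against the first one. It also uses `break` rather than `return`, so the function would carry on to the spline even if the message were printed. The helper below is hypothetical (the name `find_input_problems` and its return format are not from the question). It compares each x value with its immediate predecessor and uses `math.isnan` for the NaN test. It is written for Python 3 and assumes plain sequences of floats; the sample values around the repeated 5.86868286133 are made up for illustration.

```python
import math


def find_input_problems(data_x, data_y):
    """Return a list of problem descriptions; an empty list means none were found."""
    problems = []

    if len(data_x) != len(data_y):
        problems.append("x and y have different lengths")

    # NaN never equals anything, so test with math.isnan instead of ==.
    for name, data in (("x", data_x), ("y", data_y)):
        for index, value in enumerate(data):
            if math.isnan(value):
                problems.append("NaN in %s at index %d" % (name, index))

    # Compare each x value with the one just before it, so that equal
    # neighbours (not strictly increasing) are reported as well.
    for index in range(1, len(data_x)):
        previous, current = data_x[index - 1], data_x[index]
        if current <= previous:
            problems.append(
                "x not strictly increasing at index %d: %r then %r"
                % (index, previous, current)
            )

    return problems


# Illustrative data: only the repeated value comes from the answer above,
# the neighbouring numbers are invented.
sample_x = [5.5, 5.86868286133, 5.86868286133, 6.1]
sample_y = [1.0, 2.0, 3.0, 4.0]
for problem in find_input_problems(sample_x, sample_y):
    print(problem)
```

Returning a list of problems instead of printing and returning `None` lets the caller decide whether to stop, and it reports every issue in one pass. With pandas Series, positional indexing can behave differently from list indexing, so converting to a list first (as the question already tried) is the safer input for this sketch.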