Univariate spline interpolation returns all NaN, with no sign of bad parameters

1 vote
1 answer
854 views
Asked 2025-04-19 15:49

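I wrote a function that builds a spline interpolation function and first checks the input data for errors. If the data passes the checks, it generates a spline whose number of points is a multiple of the original count. However, even though my error checks find nothing wrong, the function still does not seem to work.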

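I check the following:

  • The x and y data must be the same size
  • The multiple of points to generate must be greater than zero: <1 means fewer points in the spline, >1 means more points, and 1 means the number of points is unchanged
  • The spline order must be between 1 and 5 (inclusive)
  • The x data must be strictly increasing
  • Neither the x nor the y data may contain NaN (not a number)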

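I think this list of checks is comprehensive and should prevent a spline that is all NaN, but when I feed in my data, none of the errors are triggered.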

Here is my code:

##
# Univariate Spline Interpolation
##

## This function interpolates the data by creating multiple times the amount of points in the data set and fitting a spline to it
## Input:
# dataX - X axis that corresponds to the dataset
# dataY - Y axis of data to fit spline on (must be same size as dataX)
# multiple - the multiplication factor, default is 2 ( <1 - Less points, 1 - same amount of points, >1 - more points)
# order - order of spline, default is 4 (3 - Cubic, 4 - Quartic)
## Output
# spline - interpolation spline object to be used for peak detection
# splinedDataX - splined X Axis
# splinedDataY - splined Y Axis

#import scipy modules for spline creation and class methods
from scipy.interpolate import UnivariateSpline, LSQUnivariateSpline

#import numpy module for linear spacing creation
from numpy import linspace, NaN

def univariate_spline_interpolation(dataX, dataY, multiple=2, order=4):

    #Libraries
    from scipy.interpolate import UnivariateSpline, LSQUnivariateSpline
    from myUnivariateSpline import MyUnivariateSpline

    #Find sizes of x and y axis for comparison and multiple
    sizeX = len(dataX)
    sizeY = len(dataY)

    #Error catching
    if(sizeX != sizeY):
        print "Data X axis and Y axis must have same size"
        return

    if(multiple <= 0):
        print "Multiple must be greater than 0"
        return

    if(order < 1 or order >5):
        print "Order must be 1 <= order <= 5"
        return

    #check for monotonic increasing function
    for indx, val in enumerate(dataX): #set first value as largest value, need to have all following increase
        if indx == 0:
            high = val
            highIndx = indx
            continue
        #if the current value is not higher than the stored value
        if val <= high:
            print "timestamp out of order"
            print "value at ", highIndx, "is ", high
            print "value at ", indx, "is ", val
            break

    #check for NaN in x and y
    for indx, val in enumerate(dataY):
        if(val == NaN):
            print "Value in Data Y at indx", indx, "is NaN"
            return

    for indx, val in enumerate(dataX):
        if(val == NaN):
            print "Value in Data X at indx", indx, "is NaN"
            return

    #Create Spline
    spline = UnivariateSpline(dataX, dataY, k=order, s=0)   

    #Create new axis based on numPoints
    numPoints = sizeX * multiple   #Find number of points for spline
    startPt = dataX[0]   #find value of first point on x axis
    endPt = dataX[-1]   #find value of last point on x axis
    splinedDataX = linspace(startPt, endPt, numPoints)   #create evenly spaced points on axis based on start, end, and number of desired data points

    #Create Y axis of splined Data
    splinedDataY = spline(splinedDataX)   #Create new Y axis with numPoints entries of data splined to fit the original data

    return spline, splinedDataX, splinedDataY

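This is one of the three data sets I tried to feed in; all three result in the returned splinedDataY being entirely NaN. interpDataY has the same size as the corresponding interpDataX. I don't know what is causing it to return only NaN.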

X data Y data

Both data sets are pandas Series, but converting them to lists did not solve the problem.

I don't know whether this is relevant, but when I print the list, every entry is nan rather than NaN.
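On the nan versus NaN point: lowercase nan is simply how Python and NumPy print a NaN float, so the spelling itself is not a symptom. What does matter is that NaN never compares equal to anything, including itself, so a test of the form `val == NaN` can never be true. The sketch below illustrates this using only the standard library and Python 3 syntax (the code above is Python 2); `find_nan_indices` is a hypothetical helper name, not part of the original code.

```python
import math

nan = float("nan")  # behaves like numpy.nan for the purposes of this sketch

print(nan)              # NaN floats are printed in lowercase
print(nan == nan)       # False: NaN never compares equal, even to itself
print(math.isnan(nan))  # True: an explicit isnan test is needed instead


def find_nan_indices(values):
    """Return the positions of all NaN entries in a sequence of floats."""
    return [i for i, v in enumerate(values) if math.isnan(v)]


print(find_nan_indices([1.0, nan, 3.0, nan]))
```

For NumPy arrays or pandas Series, `numpy.isnan` applied to the whole array should serve the same purpose.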

1 Answer

0

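Lines 152 and 153 of the data hold the same value, 5.86868286133. That may be the cause of the problem. I'm not sure why your checking code didn't catch it, since I'm not very familiar with Python, but I expect you can track down where it goes wrong.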
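As for why the check may have missed it: reading the loop in the question, `high` is assigned only at index 0 and never updated, so every later value is compared against the first element rather than against its predecessor. A repeated value in the middle of the data therefore passes. The branch also uses `break` rather than `return`, so even a detected problem would not stop the spline from being built. The SciPy documentation states that `UnivariateSpline` needs strictly increasing x when `s=0`, which fits the duplicate being the culprit. Below is a minimal sketch in Python 3 syntax; both function names are hypothetical, and the sample values other than 5.86868286133 are made up for illustration.

```python
def original_style_check(xs):
    """Mimic the question's loop: compare every value to the first one only."""
    xs = list(xs)
    high = xs[0]  # set once, never updated (as in the question)
    for val in xs[1:]:
        if val <= high:
            return False
    return True


def first_non_increasing(xs):
    """Return (index, previous, current) for the first value that does not
    exceed its predecessor, or None if xs is strictly increasing."""
    xs = list(xs)  # also accepts a pandas Series or NumPy array
    for i in range(1, len(xs)):
        if xs[i] <= xs[i - 1]:
            return i, xs[i - 1], xs[i]
    return None


# Illustrative data: only the duplicated value comes from the answer above.
sample = [5.80, 5.86868286133, 5.86868286133, 5.90]

print(original_style_check(sample))   # True: the duplicate slips through
print(first_non_increasing(sample))   # (2, 5.86868286133, 5.86868286133)
```

Comparing each element with its neighbour, and returning as soon as a violation is found, would let the function refuse the data before it reaches the spline.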
