Yet another sigmoidal regression equation question
I posted an earlier version of this yesterday, but I can't seem to add this new version to that post because someone appears to have closed it for editing, so here it is in a new post.
I have the script below, which does the following:
1.) Plot a best-fit curve to sigmoidal data.
2.) Resize the data based on new maximum and minimum coordinates.
3.) Compute and plot a new best-fit curve for the resized data.
Steps 1 and 2 seem fine, but step 3 does not. If you run the script, you will see that it plots a completely invalid curve for the resized data.
Can anyone show me how to revise the code below so that it creates and plots a true best-fit sigmoidal curve for the resized data? This needs to be repeatable across a range of new maximum and minimum values.
I have traced the problem to the variable New_p, which is defined in the following lines of code:
New_p, New_cov, New_infodict, New_mesg, New_ier = scipy.optimize.leastsq(
    residuals, New_p_guess, args=(NewX, NewY), full_output=1)
but I don't know how to dig any deeper than that. I suspect the problem may have something to do with the difference between global and local variables, but it could be something else.
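Before the full script, here is the kind of diagnostic check I have been trying (report_fit is just an illustrative helper of my own, not part of the script): inspecting the extra outputs that leastsq returns with full_output=1 shows whether the solver itself thinks it converged.
import numpy as np

def report_fit(p, infodict, mesg, ier):
    # ier values of 1, 2, 3 or 4 mean leastsq found a solution;
    # any other value means the fit failed.
    print('converged:', ier in (1, 2, 3, 4), '(ier = %d)' % ier)
    print('message:', mesg)
    # infodict['fvec'] holds the residuals at the returned parameters,
    # so the sum of its squares measures how good the fit is.
    print('sum of squared residuals:', np.sum(infodict['fvec'] ** 2))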
Here is the current draft of my complete code:
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize

def GetMinRR(age):
    MaxHR = 208 - (0.7 * age)
    MinRR = (60 / MaxHR) * 1000
    return MinRR

def sigmoid(p, x):
    x0, y0, c, k = p
    y = c / (1 + np.exp(-k * (x - x0))) + y0
    return y

def residuals(p, x, y):
    return y - sigmoid(p, x)
def resize(x, y, xmin=0.0, xmax=1.0, ymin=0.0, ymax=1.0):
    # Create local copies so the input arrays are not modified in place.
    NewX = x.copy()
    NewY = y.copy()
    # If the mins are greater than the maxs, then flip them.
    if xmin > xmax: xmin, xmax = xmax, xmin
    if ymin > ymax: ymin, ymax = ymax, ymin
    # Rescale x (and then y, identically) in three steps:
    # 1.) Subtract the actual minimum of the input vector from each value
    # 2.) Multiply each resulting value by (new range) / (actual range)
    # 3.) Add the new minimum to each value
    NewX -= x.min()
    NewX *= (xmax - xmin) / NewX.max()
    NewX += xmin
    NewY -= y.min()
    NewY *= (ymax - ymin) / NewY.max()
    NewY += ymin
    return (NewX, NewY)
# Declare raw data for use in creating logistic regression equation
x = np.array([821,576,473,377,326],dtype='float')
y = np.array([255,235,208,166,157],dtype='float')
# Call resize() function to re-calculate coordinates that will be used for equation
MinRR=GetMinRR(50)
MaxRR=1200
minLVET=(y[4]/x[4])*MinRR
maxLVET=(y[0]/x[0])*MaxRR
#x,y=resize(x,y,xmin=0.3, ymin=0.3)
NewX,NewY=resize(x,y,xmin=MinRR,xmax=MaxRR,ymin=minLVET,ymax=maxLVET)
print('x is: ', x)
print('y is: ', y)
print('NewX is: ', NewX)
print('NewY is: ', NewY)
# p_guess is the starting estimate for the minimization
p_guess=(np.median(x),np.median(y),1.0,1.0)
New_p_guess=(np.median(NewX),np.median(NewY),1.0,1.0)
# Calls the leastsq() function, which calls the residuals function with an initial
# guess for the parameters and with the x and y vectors. The full_output means that
# the function returns all optional outputs. Note that the residuals function also
# calls the sigmoid function. This will return the parameters p that minimize the
# least squares error of the sigmoid function with respect to the original x and y
# coordinate vectors that are sent to it.
p, cov, infodict, mesg, ier = scipy.optimize.leastsq(
    residuals, p_guess, args=(x, y), full_output=1)
New_p, New_cov, New_infodict, New_mesg, New_ier = scipy.optimize.leastsq(
    residuals, New_p_guess, args=(NewX, NewY), full_output=1)
# Define the optimal values for each element of p that were returned by the leastsq() function.
x0,y0,c,k=p
print('''Reference data:\
x0 = {x0}
y0 = {y0}
c = {c}
k = {k}
'''.format(x0=x0,y0=y0,c=c,k=k))
New_x0,New_y0,New_c,New_k=New_p
print('''New data:\
New_x0 = {New_x0}
New_y0 = {New_y0}
New_c = {New_c}
New_k = {New_k}
'''.format(New_x0=New_x0,New_y0=New_y0,New_c=New_c,New_k=New_k))
# Create a numpy array of x-values
xp = np.linspace(x.min(), x.max(), int(x.max() - x.min()))
New_xp = np.linspace(NewX.min(), NewX.max(), int(NewX.max() - NewX.min()))
# Return a vector pxp containing all the y values corresponding with the x-values in xp
pxp=sigmoid(p,xp)
New_pxp=sigmoid(New_p,New_xp)
# Plot the results
plt.plot(x, y, '>', xp, pxp, 'g-')
plt.plot(NewX, NewY, '^',New_xp, New_pxp, 'r-')
plt.xlabel('x')
plt.ylabel('y',rotation='horizontal')
plt.grid(True)
plt.show()
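For what it is worth, the resize() step above is just a linear min-max rescale. This standalone sketch (separate from the script, with made-up numbers) shows the formula and checks that it maps the endpoints of the data onto the requested range:
import numpy as np

def rescale(v, lower, upper):
    # Map v linearly so that v.min() lands on lower and v.max() lands on upper.
    return lower + (v - v.min()) * (upper - lower) / (v.max() - v.min())

v = np.array([326.0, 473.0, 821.0])
print(rescale(v, 100.0, 1200.0))   # endpoints become 100.0 and 1200.0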
1 Answer
Try this:
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize

def GetMinRR(age):
    MaxHR = 208 - (0.7 * age)
    MinRR = (60 / MaxHR) * 1000
    return MinRR

def sigmoid(p, x):
    x0, y0, c, k = p
    y = c / (1 + np.exp(-k * (x - x0))) + y0
    return y

def residuals(p, x, y):
    return y - sigmoid(p, x)
def resize(arr, lower=0.0, upper=1.0):
    # Create a local copy so the input array is not modified in place.
    result = arr.copy()
    # If the lower bound is greater than the upper bound, then flip them.
    if lower > upper: lower, upper = upper, lower
    # Rescale in three steps:
    # 1.) Subtract the actual minimum of the input vector from each value
    # 2.) Multiply each resulting value by (new range) / (actual range)
    # 3.) Add the new minimum to each value
    result -= result.min()
    result *= (upper - lower) / result.max()
    result += lower
    return result
# Declare raw data for use in creating logistic regression equation
x = np.array([821,576,473,377,326],dtype='float')
y = np.array([255,235,208,166,157],dtype='float')
# Call resize() function to re-calculate coordinates that will be used for equation
MinRR=GetMinRR(50)
MaxRR=1200
# x[-1] returns the last value in x
minLVET=(y[-1]/x[-1])*MinRR
maxLVET=(y[0]/x[0])*MaxRR
print(MinRR, MaxRR)
#x,y=resize(x,y,xmin=0.3, ymin=0.3)
NewX=resize(x,lower=MinRR,upper=MaxRR)
NewY=resize(y,lower=minLVET,upper=maxLVET)
print('x is: ', x)
print('y is: ', y)
print('NewX is: ', NewX)
print('NewY is: ', NewY)
# p_guess is the starting estimate for the minimization
p_guess=(np.median(x),np.min(y),np.max(y),0.01)
New_p_guess=(np.median(NewX),np.min(NewY),np.max(NewY),0.01)
# Calls the leastsq() function, which calls the residuals function with an initial
# guess for the parameters and with the x and y vectors. The full_output means that
# the function returns all optional outputs. Note that the residuals function also
# calls the sigmoid function. This will return the parameters p that minimize the
# least squares error of the sigmoid function with respect to the original x and y
# coordinate vectors that are sent to it.
p, cov, infodict, mesg, ier = scipy.optimize.leastsq(
    residuals, p_guess, args=(x, y), full_output=1)
New_p, New_cov, New_infodict, New_mesg, New_ier = scipy.optimize.leastsq(
    residuals, New_p_guess, args=(NewX, NewY), full_output=1)
# Define the optimal values for each element of p that were returned by the leastsq() function.
x0,y0,c,k=p
print('''Reference data:\
x0 = {x0}
y0 = {y0}
c = {c}
k = {k}
'''.format(x0=x0,y0=y0,c=c,k=k))
New_x0,New_y0,New_c,New_k=New_p
print('''New data:\
New_x0 = {New_x0}
New_y0 = {New_y0}
New_c = {New_c}
New_k = {New_k}
'''.format(New_x0=New_x0,New_y0=New_y0,New_c=New_c,New_k=New_k))
# Create a numpy array of x-values
xp = np.linspace(x.min(), x.max(), int(x.max() - x.min()))
New_xp = np.linspace(NewX.min(), NewX.max(), int(NewX.max() - NewX.min()))
# Return a vector pxp containing all the y values corresponding with the x-values in xp
pxp=sigmoid(p,xp)
New_pxp=sigmoid(New_p,New_xp)
# Plot the results
plt.plot(x, y, '>', xp, pxp, 'g-')
plt.plot(NewX, NewY, '^',New_xp, New_pxp, 'r-')
plt.xlabel('x')
plt.ylabel('y',rotation='horizontal')
plt.grid(True)
plt.show()
Your other related question is not closed. It looks like you registered two accounts, and stackoverflow will not let you edit the other question because it does not recognize this user and that user as the same person.
In the code above, the main thing I changed was New_p_guess. Finding good starting values for the initial guess is something of an art. If it could be done algorithmically, scipy would not be asking you to do it. A little analysis helps, as does a "feel" for your data. Knowing in advance roughly what the solution should look like, and what values would be reasonable in the context of the problem, helps in choosing good parameters. (That is a long-winded way of saying I picked k=0.01 by guessing.)
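If you want something more systematic than guessing, you can derive the starting point from the data itself. This is just a sketch of one heuristic (guess_sigmoid_params is my own name for it, and the 4/k rule of thumb is approximate), not a guaranteed recipe:
import numpy as np

def guess_sigmoid_params(x, y):
    x0 = np.median(x)          # put the transition near the middle of the data
    y0 = y.min()               # vertical offset: the lower plateau
    c = y.max() - y.min()      # amplitude: the distance between the plateaus
    # A sigmoid rises over a width of roughly 4/k, so spread the
    # transition across the x-range of the data.
    k = 4.0 / (x.max() - x.min())
    return (x0, y0, c, k)
For the data above this gives k = 4/(821-326), which is about 0.008, in the same ballpark as the hand-picked 0.01.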
。找到合适的初始猜测值有点像艺术。如果能用算法解决,scipy就不会让你自己去做了。稍微分析一下数据会有帮助,同时对你的数据有一定的“感觉”也很重要。提前知道解决方案大概是什么样子,以及在问题的背景下哪些值是合理的,这些都能帮助你选择合适的参数。(简单来说,我就是通过猜测选择了 k=0.01。)