Numpy polyfit, or another way to fit multidimensional arrays X and Y
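I have two large multidimensional arrays: Y holds three measurements for each of half a million objects (e.g. shape=(500000,3)), and X has the same shape but holds the positions at which those measurements were taken.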
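To start with, I want to fit a polynomial to each row, i.e. to each object's data. I know that iterating over an array row by row is slow, but this is what I am doing at the moment: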
fit = array([polyfit(X[i], Y[i], deg) for i in range(X.shape[0])])
My question is: is there a way to fit each row of the two arrays without iterating over them explicitly?
2 Answers
0
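Yes, if you use the newer polyfit from np.polynomial rather than the old np.polyfit. It accepts a 2-D Y with one data set per column, as long as every data set shares the same 1-D X: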
X = np.arange(3)
Y = np.random.rand(10000, 3)
fit = np.array([np.polyfit(X, y, 2) for y in Y])
fits = np.polynomial.polynomial.polyfit(X, Y.T, 2)
assert np.allclose(fit.T[::-1], fits)
Timings:
In [692]: timeit fit = np.array([np.polyfit(X, y, 2) for y in Y])
1 loops, best of 3: 2.22 s per loop
In [693]: timeit fits = np.polynomial.polynomial.polyfit(X, Y.T, 2)
100 loops, best of 3: 3.63 ms per loop
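When each row has its own sample positions, as in the question, the shared-X call above does not apply directly. One possible workaround is to build a small Vandermonde matrix per row and solve all the normal equations in one batched call. The following is only a sketch under that assumption: `polyfit_rows` is a hypothetical helper name, and the test positions (near 0, 1, 2 with a little per-row jitter) are made up for illustration. Normal equations can become ill-conditioned for high degrees or poorly spread positions, so it is best suited to low degrees like the ones discussed here.

```python
import numpy as np

def polyfit_rows(X, Y, deg):
    """Least-squares polynomial fit of each row of Y against the matching row of X.

    Returns an array of shape (n_rows, deg + 1), lowest-order coefficient first
    (the same ordering as np.polynomial.polynomial.polyfit).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Per-row Vandermonde matrices, shape (n_rows, n_points, deg + 1).
    A = X[..., None] ** np.arange(deg + 1)
    # Batched normal equations: (A^T A) c = A^T y for every row at once.
    AtA = np.einsum("nij,nik->njk", A, A)
    Aty = np.einsum("nij,ni->nj", A, Y)
    # Add a trailing axis so each right-hand side is an explicit column vector.
    return np.linalg.solve(AtA, Aty[..., None])[..., 0]

# Illustrative data: positions near 0, 1, 2 with a small per-row jitter.
rng = np.random.default_rng(0)
X = np.arange(3) + 0.1 * rng.random((1000, 3))
Y = rng.random((1000, 3))

coefs = polyfit_rows(X, Y, 1)  # straight-line fit for every row

# Spot-check a few rows against the one-row-at-a-time routine.
for i in range(5):
    ref = np.polynomial.polynomial.polyfit(X[i], Y[i], 1)
    assert np.allclose(coefs[i], ref)
```

The function still loops nowhere in Python; all per-row work happens inside `einsum` and `np.linalg.solve`, which is where the speed-up over the list comprehension would come from.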
1
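It is possible to do this without iterating along the first axis. However, your second axis is rather short (only 3 elements), so you can meaningfully fit no more than 2 coefficients; a third coefficient would simply interpolate the three points exactly.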
In [67]:
import numpy as np
import scipy.optimize as so
In [68]:
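def MD_ployError(p, x, y):
    '''x and y must both have shape (n, m); p is a flat array of n*k
    coefficients, k per row, lowest order first.'''
    # one row of k coefficients per row of x
    p_rshp = p.reshape((x.shape[0], -1))
    f = y * 1.  # float copy of y; becomes the residual
    for i in range(p_rshp.shape[1]):
        # subtract the i-th order term of every row's polynomial
        f -= p_rshp[:, i][:, np.newaxis] * (x**i)
    # total squared error summed over all rows
    return (f**2).sum()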
In [69]:
X=np.random.random((100, 6))
Y=4+2*X+3*X*X
P=(np.zeros((100,3))+[1,1,1]).ravel()
In [70]:
MD_ployError(P, X, Y)
Out[70]:
11012.2067606684
In [71]:
R=so.fmin_slsqp(MD_ployError, P, args=(X, Y))
Iteration limit exceeded (Exit mode 9) #you can increase iteration limit, but the result is already good enough.
Current function value: 0.00243784856039
Iterations: 101
Function evaluations: 30590
Gradient evaluations: 101
In [72]:
R.reshape((100, -1))
Out[72]:
array([[ 3.94488512, 2.25402422, 2.74773571],
[ 4.00474864, 1.97966551, 3.02010015],
[ 3.99919559, 2.0032741 , 2.99753804],
..............................................)
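The earlier remark about the short second axis can be checked with a small example. This is only an illustration with made-up sample values: with three points per row, a straight line (2 coefficients) leaves a residual, while a quadratic (3 coefficients) passes through all three points exactly, so it has no degrees of freedom left to average out noise.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Three made-up, non-collinear sample points standing in for one row.
x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 2.0])

line = P.polyfit(x, y, 1)  # 2 coefficients
quad = P.polyfit(x, y, 2)  # 3 coefficients

line_resid = y - P.polyval(x, line)
quad_resid = y - P.polyval(x, quad)

# The quadratic reproduces the data exactly; the line does not.
assert np.allclose(quad_resid, 0.0)
assert not np.allclose(line_resid, 0.0)
```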