SciPy中二维插值的问题,非矩形网格
我一直在尝试使用scipy.interpolate.bisplrep()和scipy.interpolate.interp2d()来为我的(218x135)二维球极网格上的数据找到插值方法。为了这些函数,我传入了网格节点的二维数组X和Y,这些数组包含了它们在笛卡尔坐标系中的位置。可是我总是遇到一些错误,比如下面这个(使用interp2d进行线性插值时):
“警告:无法再添加更多的节点,因为新的节点会与旧的节点重合。可能的原因是:s值太小,或者给了不准确的数据点过大的权重。(fp>s)kx,ky=1,1 nx,ny=4,5 m=29430 fp=1390609718.902140 s=0.000000”
对于双变量样条插值,使用默认的平滑参数s时,我也得到了类似的结果。我的数据是平滑的。我把我的代码附在下面,以防我做错了什么明显的事情。
有没有什么建议?谢谢!
class Field(object):
Nr = 0
Ntheta = 0
grid = np.array([])
def __init__(self, Nr, Ntheta, f):
self.Nr = Nr
self.Ntheta = Ntheta
self.grid = np.empty([Nr, Ntheta])
for i in range(Nr):
for j in range(Ntheta):
self.grid[i,j] = f[i*Ntheta + j]
def calculate_lines(filename):
ri,ti,r,t,Br,Bt,Bphi,Bmag = np.loadtxt(filename, skiprows=3,\
usecols=(1,2,3,4,5,6,7,9), unpack=True)
Nr = int(max(ri)) + 1
Ntheta = int(max(ti)) + 1
### Initialise coordinate grids ###
X = np.empty([Nr, Ntheta])
Y = np.empty([Nr, Ntheta])
for i in range(Nr):
for j in range(Ntheta):
indx = i*Ntheta + j
X[i,j] = r[indx]*sin(t[indx])
Y[i,j] = r[indx]*cos(t[indx])
### Initialise field objects ###
Bradial = Field(Nr=Nr, Ntheta=Ntheta, f=Br)
### Interpolate the fields ###
intp_Br = interpolate.interp2d(X, Y, Bradial.grid, kind='linear')
#rbf_0 = interpolate.Rbf(X,Y, Bradial.grid, epsilon=2)
return
1 个回答
补充说明:Kyle在一个scipy-user 讨论串中跟进了这个问题。
30号更新:@Kyle,看起来在笛卡尔坐标系的X、Y和极坐标的Xnew、Ynew之间有些混淆。请查看下面过长的说明中的“polar”。
# griddata vs SmoothBivariateSpline
# http://stackoverflow.com/questions/3526514/
# problem-with-2d-interpolation-in-scipy-non-rectangular-grid
# http://www.scipy.org/Cookbook/Matplotlib/Gridding_irregularly_spaced_data
# http://en.wikipedia.org/wiki/Natural_neighbor
# http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html
from __future__ import division
import sys
import numpy as np
from scipy.interpolate import SmoothBivariateSpline # $scipy/interpolate/fitpack2.py
from matplotlib.mlab import griddata
__date__ = "2010-10-08 Oct" # plot diffs, ypow
# "2010-09-13 Sep" # smooth relative
def avminmax( X ):
absx = np.abs( X[ - np.isnan(X) ])
av = np.mean(absx)
m, M = np.nanmin(X), np.nanmax(X)
histo = np.histogram( X, bins=5, range=(m,M) ) [0]
return "av %.2g min %.2g max %.2g histo %s" % (av, m, M, histo)
def cosr( x, y ):
return 10 * np.cos( np.hypot(x,y) / np.sqrt(2) * 2*np.pi * cycle )
def cosx( x, y ):
return 10 * np.cos( x * 2*np.pi * cycle )
def dipole( x, y ):
r = .1 + np.hypot( x, y )
t = np.arctan2( y, x )
return np.cos(t) / r**3
#...............................................................................
testfunc = cosx
Nx = Ny = 20 # interpolate random Nx x Ny points -> Newx x Newy grid
Newx = Newy = 100
cycle = 3
noise = 0
ypow = 2 # denser => smaller error
imclip = (-5., 5.) # plot trierr, splineerr to same scale
kx = ky = 3
smooth = .01 # Spline s = smooth * z2sum, see note
# s is a target for sum (Z() - spline())**2 ~ Ndata and Z**2;
# smooth is relative, s absolute
# s too small => interpolate/fitpack2.py:580: UserWarning: ier=988, junk out
# grr error message once only per ipython session
seed = 1
plot = 0
exec "\n".join( sys.argv[1:] ) # run this.py N= ...
np.random.seed(seed)
np.set_printoptions( 1, threshold=100, suppress=True ) # .1f
print 80 * "-"
print "%s Nx %d Ny %d -> Newx %d Newy %d cycle %.2g noise %.2g kx %d ky %d smooth %s" % (
testfunc.__name__, Nx, Ny, Newx, Newy, cycle, noise, kx, ky, smooth)
#...............................................................................
# interpolate X Y Z to xnew x ynew --
X, Y = np.random.uniform( size=(Nx*Ny, 2) ) .T
Y **= ypow
# 1d xlin ylin -> 2d X Y Z, Ny x Nx --
# xlin = np.linspace( 0, 1, Nx )
# ylin = np.linspace( 0, 1, Ny )
# X, Y = np.meshgrid( xlin, ylin )
Z = testfunc( X, Y ) # Ny x Nx
if noise:
Z += np.random.normal( 0, noise, Z.shape )
# print "Z:\n", Z
z2sum = np.sum( Z**2 )
xnew = np.linspace( 0, 1, Newx )
ynew = np.linspace( 0, 1, Newy )
Zexact = testfunc( *np.meshgrid( xnew, ynew ))
if imclip is None:
imclip = np.min(Zexact), np.max(Zexact)
xflat, yflat, zflat = X.flatten(), Y.flatten(), Z.flatten()
#...............................................................................
print "SmoothBivariateSpline:"
fit = SmoothBivariateSpline( xflat, yflat, zflat, kx=kx, ky=ky, s = smooth * z2sum )
Zspline = fit( xnew, ynew ) .T # .T ??
splineerr = Zspline - Zexact
print "Zspline - Z:", avminmax(splineerr)
print "Zspline: ", avminmax(Zspline)
print "Z: ", avminmax(Zexact)
res = fit.get_residual()
print "residual %.0f res/z2sum %.2g" % (res, res / z2sum)
# print "knots:", fit.get_knots()
# print "Zspline:", Zspline.shape, "\n", Zspline
print ""
#...............................................................................
print "griddata:"
Ztri = griddata( xflat, yflat, zflat, xnew, ynew )
# 1d x y z -> 2d Ztri on meshgrid(xnew,ynew)
nmask = np.ma.count_masked(Ztri)
if nmask > 0:
print "info: griddata: %d of %d points are masked, not interpolated" % (
nmask, Ztri.size)
Ztri = Ztri.data # Nans outside convex hull
trierr = Ztri - Zexact
print "Ztri - Z:", avminmax(trierr)
print "Ztri: ", avminmax(Ztri)
print "Z: ", avminmax(Zexact)
print ""
#...............................................................................
if plot:
import pylab as pl
nplot = 2
fig = pl.figure( figsize=(10, 10/nplot + .5) )
pl.suptitle( "Interpolation error: griddata - %s, BivariateSpline - %s" % (
testfunc.__name__, testfunc.__name__ ), fontsize=11 )
def subplot( z, jplot, label ):
ax = pl.subplot( 1, nplot, jplot )
im = pl.imshow(
np.clip( z, *imclip ), # plot to same scale
cmap=pl.cm.RdYlBu,
interpolation="nearest" )
# nearest: squares, else imshow interpolates too
# todo: centre the pixels
ny, nx = z.shape
pl.scatter( X*nx, Y*ny, edgecolor="y", s=1 ) # for random XY
pl.xlabel(label)
return [ax, im]
subplot( trierr, 1,
"griddata, Delaunay triangulation + Natural neighbor: max %.2g" %
np.nanmax(np.abs(trierr)) )
ax, im = subplot( splineerr, 2,
"SmoothBivariateSpline kx %d ky %d smooth %.3g: max %.2g" % (
kx, ky, smooth, np.nanmax(np.abs(splineerr)) ))
pl.subplots_adjust( .02, .01, .92, .98, .05, .05 ) # l b r t
cax = pl.axes([.95, .05, .02, .9]) # l b w h
pl.colorbar( im, cax=cax ) # -1.5 .. 9 ??
if plot >= 2:
pl.savefig( "tmp.png" )
pl.show()
关于二维插值的笔记,BivariateSpline与griddata的比较。
scipy.interpolate.*BivariateSpline
和matplotlib.mlab.griddata
都需要一维数组作为输入:
Znew = griddata( X,Y,Z, Xnew,Ynew )
# 1d X Y Z Xnew Ynew -> interpolated 2d Znew on meshgrid(Xnew,Ynew)
assert X.ndim == Y.ndim == Z.ndim == 1 and len(X) == len(Y) == len(Z)
输入的X,Y,Z
描述了三维空间中的一个表面或点云:
X,Y
(或者说是经纬度等)是在一个平面上的点,
而Z
则是这个平面上方的表面或地形。
X,Y
可以填满大部分矩形区域[Xmin .. Xmax] x [Ymin .. Ymax],
也可能只是一个在里面的曲线S或Y。
Z
的表面可能是光滑的,或者光滑加上一点噪声,
也可能根本不光滑,比如粗糙的火山山脉。
Xnew和Ynew通常也是一维的,描述了一个|Xnew| x |Ynew|的矩形网格,
在这个网格上你想要插值或估算Z。
Znew = griddata(...)会返回这个网格上的二维数组,np.meshgrid(Xnew,Ynew):
Znew[Xnew0,Ynew0], Znew[Xnew1,Ynew0], Znew[Xnew2,Ynew0] ...
Znew[Xnew0,Ynew1] ...
Znew[Xnew0,Ynew2] ...
...
如果Xnew和Ynew的点离输入的X,Y点太远,就会出现问题。
griddata
会检查这一点:
如果任何网格点在输入数据定义的凸包之外,就会返回一个被屏蔽的数组(不会进行外推)。
(“凸包”是一个想象中的橡皮筋,拉伸在所有X,Y点周围的区域。)
griddata
的工作原理是首先构建输入X,Y的德劳内三角剖分,然后进行自然邻居插值。这种方法既稳健又相当快速。
而BivariateSpline可以进行外推,
可能会产生剧烈的波动而没有任何警告。
此外,Fitpack中的所有*Spline例程对平滑参数S非常敏感。
Dierckx的书(books.google isbn 019853440X 第89页)提到:
如果S太小,样条曲线的近似会过于波动,
并且会捕捉到过多的噪声(过拟合);
如果S太大,样条曲线会过于平滑,
信号会丢失(欠拟合)。
散乱数据的插值很困难,平滑也不容易,两者结合起来就更难了。 插值器应该如何处理XY中的大空洞,或者非常嘈杂的Z呢? (“如果你想卖出去,你得描述清楚。”)
还有更多的笔记,细节:
1d与2d:一些插值器可以接受X,Y,Z作为一维或二维。 其他的只接受一维,所以在插值之前需要先扁平化:
Xmesh, Ymesh = np.meshgrid( np.linspace(0,1,Nx), np.linspace(0,1,Ny) )
Z = f( Xmesh, Ymesh ) # Nx x Ny
Znew = griddata( Xmesh.flatten(), Ymesh.flatten(), Z.flatten(), Xnew, Ynew )
关于被屏蔽的数组:matplotlib处理得很好, 只绘制未屏蔽的/非NaN的点。 但我不敢保证某些numpy/scipy函数会正常工作。 检查插值是否在X,Y的凸包之外,可以这样做:
Znew = griddata(...)
nmask = np.ma.count_masked(Znew)
if nmask > 0:
print "info: griddata: %d of %d points are masked, not interpolated" % (
nmask, Znew.size)
# Znew = Znew.data # array with NaNs
关于极坐标:
X,Y和Xnew,Ynew应该在同一个空间中,
要么都是笛卡尔坐标,要么都是在[rmin .. rmax] x [tmin .. tmax]中。
要在三维中绘制(r, theta, z)点:
from mpl_toolkits.mplot3d import Axes3D
Znew = griddata( R,T,Z, Rnew,Tnew )
ax = Axes3D(fig)
ax.plot_surface( Rnew * np.cos(Tnew), Rnew * np.sin(Tnew), Znew )
另见(我还没试过这个):
ax = subplot(1,1,1, projection="polar", aspect=1.)
ax.pcolormesh(theta, r, Z)
给谨慎的程序员两个建议:
检查异常值,或者奇怪的缩放:
def minavmax( X ):
m = np.nanmin(X)
M = np.nanmax(X)
av = np.mean( X[ - np.isnan(X) ]) # masked ?
histo = np.histogram( X, bins=5, range=(m,M) ) [0]
return "min %.2g av %.2g max %.2g histo %s" % (m, av, M, histo)
for nm, x in zip( "X Y Z Xnew Ynew Znew".split(),
(X,Y,Z, Xnew,Ynew,Znew) ):
print nm, minavmax(x)
用简单的数据检查插值:
interpolate( X,Y,Z, X,Y ) -- interpolate at the same points
interpolate( X,Y, np.ones(len(X)), Xnew,Ynew ) -- constant 1 ?