How can I smooth a curve in Python without introducing errors at the boundaries?
Consider the curve defined by two numpy arrays, x and y:
How can I correctly smooth this curve in Python so that there are no problems near xmax? (If I apply a Gaussian filter, the curve rises at the end.)
The data is here (two columns): http://lite4.framapad.org/p/xqhpGJpV5R
2 Answers
7
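If all of your data varies slowly in log space, I would do the following:
- Heavily downsample the logarithm of the data onto an evenly spaced grid
- Compute a smoothing spline through the downsampled points
- Transform back to linear scale
For example: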
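import numpy as np
from scipy.interpolate import interp1d, splrep, splev
import pylab
# x and y are the original data arrays from the question.
# Work in log space, keeping x and y untouched for plotting.
logx = np.log10(x)
logy = np.log10(y)
# Downsample heavily onto a coarse, evenly spaced grid in log space
ip = interp1d(logx, logy)
xi = np.linspace(logx.min(), logx.max(), 10)
yi = ip(xi)
# Fit a smoothing spline to the coarse points and evaluate it on a finer grid
tck = splrep(xi, yi, s=1)
xs = np.linspace(logx.min(), logx.max(), 100)
ys = splev(xs, tck)
# Transform everything back to linear scale
xi = np.power(10, xi)
yi = np.power(10, yi)
xs = np.power(10, xs)
ys = np.power(10, ys)
f = pylab.figure()
pl = f.add_subplot(111)
pl.loglog(x, y, alpha=0.4)
pl.loglog(xi, yi, 'go--', linewidth=1, label='linear')
pl.loglog(xs, ys, 'r-', linewidth=1, label='spline')
pl.legend(loc=0)
pylab.show()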
This gives the following result:
5
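The simplest approach is to detrend the signal before filtering it. The edge effects you are seeing are mostly due to the signal not being stationary (that is, it has a slope).
First, let's demonstrate the problem: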
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import gaussian_filter1d
x, y = np.loadtxt('spectrum.dat').T
# Smooth with a Gaussian filter
smooth = gaussian_filter1d(y, 10)
fig, ax = plt.subplots()
ax.loglog(x, y, color='black')
ax.loglog(x, smooth, color='red')
plt.show()
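Ouch! The edge effects are particularly bad at the end (the right-hand side) of your data because that is where the slope is steepest. If the slope were steeper at the start, you would see stronger edge effects there as well.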
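Since the data file is not reproduced here, the link between slope and edge effect can be checked on synthetic data. The following is a minimal sketch, assuming a noise-free parabola in log-log space as a stand-in for the spectrum; the coefficients and the grid are made up for illustration.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

# Hypothetical stand-in for the spectrum: a noise-free parabola in
# log-log space whose slope is steepest at the right-hand end.
x = np.logspace(0, 3, 500)
logx = np.log(x)
y = np.exp(2.0 + 1.5 * logx - 0.4 * logx**2)

# Same plain Gaussian smoothing as in the demonstration above
smooth = gaussian_filter1d(y, 10)

# Relative deviation from the (already smooth) input curve
rel_err = np.abs(smooth - y) / y
err_left = rel_err[0]
err_mid = rel_err[len(x) // 2]
err_right = rel_err[-1]

print("left edge: ", err_left)
print("middle:    ", err_mid)
print("right edge:", err_right)
```

With these assumed coefficients the deviation is largest at the right edge, smaller at the left edge, and smallest in the middle, and the smoothed curve ends above the data, which matches the "rises at the end" symptom in the question.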
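The good news is that there are a number of ways to correct for this. @ChristianK.'s answer shows how to use a smoothing spline as an effective low-pass filter. I'll demonstrate a few other signal-processing methods that accomplish the same thing. Which one is "best" depends entirely on your needs. Smoothing splines are straightforward, while the "fancier" signal-processing methods give you more precise control over which frequencies are filtered out.
Your data looks like a parabola in log-log space, so let's detrend it with a second-order polynomial in log-log space and then apply the filter.
As a quick example: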
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import gaussian_filter1d
x, y = np.loadtxt('spectrum.dat').T
# Let's detrend by fitting a second-order polynomial in log space
# (Note that your data looks like a parabola in log-log space.)
logx, logy = np.log(x), np.log(y)
model = np.polyfit(logx, logy, 2)
trend = np.polyval(model, logx)
# Smooth with a Gaussian filter
smooth = gaussian_filter1d(logy - trend, 10)
# Add the trend back in and convert back to linear space
smooth = np.exp(smooth + trend)
fig, ax = plt.subplots()
ax.loglog(x, y, color='black')
ax.loglog(x, smooth, color='red')
plt.show()
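To put a number on the improvement without the original file, here is a rough sketch that compares plain Gaussian smoothing with the detrend-then-smooth approach on synthetic data. The parabola coefficients, the noise level and the `edge_error` helper are all assumptions made for this illustration, not part of the original data or answer.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

# Synthetic stand-in: a known parabola in log-log space plus seeded noise
rng = np.random.default_rng(0)
x = np.logspace(0, 3, 500)
logx = np.log(x)
true_logy = 2.0 + 1.5 * logx - 0.4 * logx**2
logy = true_logy + rng.normal(scale=0.05, size=x.size)
y = np.exp(logy)

# Plain Gaussian smoothing in linear space
plain = gaussian_filter1d(y, 10)

# Detrend in log-log space, smooth the residual, add the trend back
model = np.polyfit(logx, logy, 2)
trend = np.polyval(model, logx)
detrended = np.exp(gaussian_filter1d(logy - trend, 10) + trend)

def edge_error(smooth, n=20):
    """Mean absolute log-error over the last n samples (hypothetical helper)."""
    return np.mean(np.abs(np.log(smooth[-n:]) - true_logy[-n:]))

err_plain = edge_error(plain)
err_detrended = edge_error(detrended)
print("plain Gaussian:      ", err_plain)
print("detrended + Gaussian:", err_detrended)
```

On this synthetic curve the detrended version stays much closer to the known underlying curve near the right edge. How large the gain is on real data depends on how well a second-order polynomial describes the trend.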
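Note that we still have some edge effects. The Gaussian kernel is symmetric, so these most likely come from how the filter pads the data at the boundaries (gaussian_filter1d reflects the data by default) rather than from a phase shift. If we want to get fancier, we can detrend and then use a zero-phase filter to reduce the edge effects further.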
import numpy as np
import matplotlib.pyplot as plt
import scipy.signal as signal
def main():
    x, y = np.loadtxt('spectrum.dat').T
    logx, logy = np.log(x), np.log(y)
    smooth_log = detrend_zero_phase(logx, logy)
    smooth = np.exp(smooth_log)
    fig, ax = plt.subplots()
    ax.loglog(x, y, 'k-')
    ax.loglog(x, smooth, 'r-')
    plt.show()
def zero_phase(y):
    # Low-pass filter...
    b, a = signal.butter(3, 0.05)
    # Filtfilt applies the filter twice to avoid phase shifts.
    return signal.filtfilt(b, a, y)
def detrend_zero_phase(x, y):
    # Fit a second order polynomial (Can't just use scipy.signal.detrend here,
    # because we need to know what the trend is to add it back in.)
    model = np.polyfit(x, y, 2)
    trend = np.polyval(model, x)
    # Apply a zero-phase filter to the detrended curve.
    smooth = zero_phase(y - trend)
    # Add the trend back in
    return smooth + trend
main()
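As for controlling which frequencies are removed: the second argument to signal.butter is the cutoff as a fraction of the Nyquist frequency, and because filtfilt runs the filter forward and then backward, the effective gain away from the edges is the square of the single-pass gain. The sketch below checks this with signal.freqz; the three test frequencies are arbitrary choices for illustration.

```python
import numpy as np
import scipy.signal as signal

# Same low-pass design as above: 3rd order, cutoff at 0.05 of Nyquist
b, a = signal.butter(3, 0.05)

# Nyquist corresponds to pi rad/sample, so the cutoff sits at 0.05 * pi.
w_c = 0.05 * np.pi

# Evaluate the response at DC, at the cutoff, and at four times the cutoff
w, h = signal.freqz(b, a, worN=np.array([0.0, w_c, 4 * w_c]))

gain_single = np.abs(h)          # one pass of the filter
gain_filtfilt = gain_single**2   # forward + backward pass (interior samples)

print("single pass:", gain_single)
print("filtfilt:   ", gain_filtfilt)
```

A single pass attenuates to about 0.707 at the cutoff, so the forward-backward result is down to about 0.5 there. Lowering the cutoff smooths more aggressively, and raising it keeps more of the fine structure.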