在Python中可逆STFT和ISTFT

2024-05-20 01:52:37 发布

您现在位置:Python中文网/ 问答频道 /正文

有没有任何通用形式的short-time Fourier transform与相应的反变换一起内置到SciPy或NumPy或其他东西中?

matplotlib中有pyplotspecgram函数,它调用ax.specgram(),它调用mlab.specgram(),它调用^{}

#The checks for if y is x are so that we can use the same function to
#implement the core of psd(), csd(), and spectrogram() without doing
#extra calculations.  We return the unaveraged Pxy, freqs, and t.

但是

This is a helper function that implements the commonality between the 204 #psd, csd, and spectrogram. It is NOT meant to be used outside of mlab

不过,我不确定这是否可以用于STFT和ISTFT。还有什么,或者我应该翻译一些像these MATLAB functions这样的东西吗?

我知道如何编写自己的特别实现;我只是在寻找一些功能齐全的东西,它可以处理不同的窗口功能(但有一个合理的默认值),完全可逆的可乐窗口(istft(stft(x))==x),由多人测试,没有一个错误关闭,处理好结束和零填充,实时输入的快速RFFT实现等


Tags: andofthetothattimeisfunction
3条回答

我来晚了一点,但我意识到scipy在0.19.0中内置了istft函数

这是我使用的STFT代码。这里的STFT+ISTFT给出了完美的重建(即使是第一帧)。我稍微修改了Steve Tjoa给出的代码:这里重构信号的大小与输入信号的大小相同。

import scipy, numpy as np

def stft(x, fftsize=1024, overlap=4):   
    hop = fftsize / overlap
    w = scipy.hanning(fftsize+1)[:-1]      # better reconstruction with this trick +1)[:-1]  
    return np.array([np.fft.rfft(w*x[i:i+fftsize]) for i in range(0, len(x)-fftsize, hop)])

def istft(X, overlap=4):   
    fftsize=(X.shape[1]-1)*2
    hop = fftsize / overlap
    w = scipy.hanning(fftsize+1)[:-1]
    x = scipy.zeros(X.shape[0]*hop)
    wsum = scipy.zeros(X.shape[0]*hop) 
    for n,i in enumerate(range(0, len(x)-fftsize, hop)): 
        x[i:i+fftsize] += scipy.real(np.fft.irfft(X[n])) * w   # overlap-add
        wsum[i:i+fftsize] += w ** 2.
    pos = wsum != 0
    x[pos] /= wsum[pos]
    return x

下面是我的Python代码,针对这个答案进行了简化:

import scipy, pylab

def stft(x, fs, framesz, hop):
    framesamp = int(framesz*fs)
    hopsamp = int(hop*fs)
    w = scipy.hanning(framesamp)
    X = scipy.array([scipy.fft(w*x[i:i+framesamp]) 
                     for i in range(0, len(x)-framesamp, hopsamp)])
    return X

def istft(X, fs, T, hop):
    x = scipy.zeros(T*fs)
    framesamp = X.shape[1]
    hopsamp = int(hop*fs)
    for n,i in enumerate(range(0, len(x)-framesamp, hopsamp)):
        x[i:i+framesamp] += scipy.real(scipy.ifft(X[n]))
    return x

注:

  1. 列表理解是一个小技巧,我喜欢用它来模拟numpy/scipy中的信号块处理。就像在Matlab中blkproc。与for循环不同,我对列表理解中的每个信号帧应用一个命令(例如fft),然后scipy.array将其转换为2D数组。我用它来制作光谱图,色度图,MFCC图等等。
  2. 在这个例子中,我在istft中使用了一个简单的overlap和add方法。为了重建原始信号,序列窗口函数的和必须是常数,最好等于单位(1.0)。在本例中,我选择了Hann(或hanning)窗口和一个50%重叠的窗口,它工作得很好。有关详细信息,请参见this discussion
  3. 可能有更多的计算ISTFT的原则方法。这个例子主要是为了教育。

测试:

if __name__ == '__main__':
    f0 = 440         # Compute the STFT of a 440 Hz sinusoid
    fs = 8000        # sampled at 8 kHz
    T = 5            # lasting 5 seconds
    framesz = 0.050  # with a frame size of 50 milliseconds
    hop = 0.025      # and hop size of 25 milliseconds.

    # Create test signal and STFT.
    t = scipy.linspace(0, T, T*fs, endpoint=False)
    x = scipy.sin(2*scipy.pi*f0*t)
    X = stft(x, fs, framesz, hop)

    # Plot the magnitude spectrogram.
    pylab.figure()
    pylab.imshow(scipy.absolute(X.T), origin='lower', aspect='auto',
                 interpolation='nearest')
    pylab.xlabel('Time')
    pylab.ylabel('Frequency')
    pylab.show()

    # Compute the ISTFT.
    xhat = istft(X, fs, T, hop)

    # Plot the input and output signals over 0.1 seconds.
    T1 = int(0.1*fs)

    pylab.figure()
    pylab.plot(t[:T1], x[:T1], t[:T1], xhat[:T1])
    pylab.xlabel('Time (seconds)')

    pylab.figure()
    pylab.plot(t[-T1:], x[-T1:], t[-T1:], xhat[-T1:])
    pylab.xlabel('Time (seconds)')

STFT of 440 Hz sinusoidISTFT of beginning of 440 Hz sinusoidISTFT of end of 440 Hz sinusoid

相关问题 更多 >