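I am trying to reproduce the preprocessing shown in the YouTube video https://www.youtube.com/watch?v=g-sndkf7mCs, where they build a spectrogram by splitting the audio into 20 ms windows and applying an FFT to each window. They then feed the resulting spectrogram into a neural network. I am using the scipy package, but I am a bit confused about which parameters to use. Here is the code: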
    import numpy as np
    from scipy import signal
    from scipy.io import wavfile

    def get_spectrogram(path, nsamples=16000):
        '''
        Given path, return specgram.
        '''
        # read the wav file; [1] selects the sample array
        wav = wavfile.read(path)[1]  # 16000 samples per second
        # zero pad the shorter samples and cut off the long ones to have a signal of 1 sec.
        if wav.size < nsamples:
            d = np.pad(wav, (nsamples - wav.size, 0), mode='constant')
        else:
            d = wav[0:nsamples]
        # get the specgram; [2] selects the spectrogram array
        specgram = signal.spectrogram(d, fs= ? , nperseg=None, noverlap=None, nfft=None)[2]
        return specgram
I would also like to know what the shape of the output is. Is it (X, 1)?
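For reference, here is a minimal sketch of one way the parameters could be set for 20 ms windows. It assumes a 16 kHz sampling rate (as stated in the code comment above), so 20 ms corresponds to 320 samples. The helper name `spectrogram_20ms`, the 10 ms overlap, and the synthetic 440 Hz test tone are assumptions made for illustration; the overlap used in the video is not known.

```python
import numpy as np
from scipy import signal


def spectrogram_20ms(samples, fs=16000, window_ms=20, overlap_ms=10):
    """Return (freqs, times, Sxx) using window lengths given in milliseconds.

    The 10 ms overlap is an illustrative assumption, not a value from the video.
    """
    nperseg = int(round(window_ms * fs / 1000))    # 20 ms at 16 kHz -> 320 samples
    noverlap = int(round(overlap_ms * fs / 1000))  # 10 ms at 16 kHz -> 160 samples
    # nfft is left at its default, which equals nperseg (no zero padding of windows)
    return signal.spectrogram(samples, fs=fs, nperseg=nperseg, noverlap=noverlap)


# Synthetic 1-second, 440 Hz tone standing in for a padded or truncated recording.
fs = 16000
t = np.arange(fs) / fs
d = np.sin(2 * np.pi * 440 * t)

freqs, times, specgram = spectrogram_20ms(d, fs=fs)
print(specgram.shape)  # (161, 99)
```

Under these assumptions the output is a 2-D array of shape (frequency bins, time frames), not (X, 1). The number of rows is `nperseg // 2 + 1 = 161`, and the number of columns is `(16000 - 160) // 160 = 99`. With `noverlap=0` the same signal would give 50 columns instead.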