Python: Spectrograms for speech recognition

Posted 2024-04-24 09:02:03


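I am trying to reproduce the preprocessing shown in the YouTube video https://www.youtube.com/watch?v=g-sndkf7mCs, where they build a spectrogram by splitting the audio into 20 ms windows and applying an FFT to each window. They then feed the resulting spectrogram into a neural network. I am using the scipy package, but I am a bit confused about which parameters to use. Here is my code: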

import numpy as np
from scipy import signal
from scipy.io import wavfile

def get_spectrogram(path, nsamples=16000):
    '''
    Given path, return specgram.
    '''
    # read the wav file; wavfile.read returns (sample rate, data), only the data is kept
    wav = wavfile.read(path)[1] # 16000 samples per second

    # zero-pad shorter clips (at the start) and truncate longer ones to get a 1-second signal
    if wav.size < nsamples:
        d = np.pad(wav, (nsamples - wav.size, 0), mode='constant')
    else:
        d = wav[0:nsamples]

    # get the specgram; index [2] selects the spectrogram array (which value should fs take?)
    specgram = signal.spectrogram(d, fs= ? , nperseg=None, noverlap=None, nfft=None)[2]

    return specgram
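One way the open parameters might be filled in is sketched below. In `scipy.signal.spectrogram`, `fs` is the sampling rate in Hz and `nperseg` is the window length in samples, so a 20 ms window at 16000 Hz is 320 samples. The helper name `spectrogram_20ms`, the 50% overlap, and the synthetic 440 Hz test tone are assumptions made for illustration; the video may use different settings.

```python
import numpy as np
from scipy import signal


def spectrogram_20ms(samples, rate, window_ms=20.0, overlap=0.5):
    """Hypothetical helper: spectrogram with a fixed window length in milliseconds."""
    # Window length in samples: 20 ms at 16000 Hz gives 320 samples.
    nperseg = int(round(rate * window_ms / 1000.0))
    # Overlap between consecutive windows (50% is an assumed value, not from the video).
    noverlap = int(nperseg * overlap)
    # fs is the sampling rate in Hz; nfft is left at its default (equal to nperseg).
    freqs, times, sxx = signal.spectrogram(
        samples, fs=rate, nperseg=nperseg, noverlap=noverlap
    )
    return freqs, times, sxx


# Synthetic 1-second, 440 Hz tone standing in for a real recording.
rate = 16000
t = np.arange(rate) / rate
tone = np.sin(2 * np.pi * 440.0 * t)

freqs, times, sxx = spectrogram_20ms(tone, rate)
peak_hz = freqs[np.argmax(sxx.mean(axis=1))]
print(sxx.shape, peak_hz)
```

With a real file, the rate can be taken from the first element of the tuple returned by `wavfile.read(path)` rather than hard-coded.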

Also, what is the shape of the output? Is it (X, 1)?
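As a rough check on the shape question: the third value returned by `scipy.signal.spectrogram` is a 2-D array of shape (frequency bins, time frames), not (X, 1). The sketch below compares that shape with a simple count, assuming real-valued input, the default one-sided spectrum, `nfft` left at `nperseg`, and no boundary padding. The helper `expected_shape` and the 320/160 window settings are illustrative assumptions.

```python
import numpy as np
from scipy import signal


def expected_shape(nsamples, nperseg, noverlap):
    """Hypothetical helper: predicted (n_freqs, n_frames) for a 1-D real signal."""
    # A one-sided FFT of nperseg real samples yields nperseg // 2 + 1 frequency bins.
    n_freqs = nperseg // 2 + 1
    # Each new frame advances by (nperseg - noverlap) samples; partial frames are dropped.
    n_frames = (nsamples - noverlap) // (nperseg - noverlap)
    return n_freqs, n_frames


# One second of silence at 16000 Hz stands in for the padded or truncated clip.
d = np.zeros(16000)
sxx = signal.spectrogram(d, fs=16000, nperseg=320, noverlap=160)[2]

print(sxx.shape)                         # (161, 99)
print(expected_shape(16000, 320, 160))   # (161, 99)
```

If the network expects an explicit channel axis, something like `sxx[..., np.newaxis]` would give shape (161, 99, 1).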


