MFCC and delta coefficients in 3 Python libraries

Posted 2024-04-25 22:16:08


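I've recently been doing homework on MFCCs, and I can't figure out what the differences are between these libraries.

The 3 libraries I used are: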

python_speech_features

SpeechPy

LibROSA

samplerate = 16000
NFFT = 512
NCEPT = 13

Part 1: Mel filter bank


pic1

Only the speechpy filter bank has shape (, 512); all the others are (, 257). The librosa plot also looks slightly distorted.
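One plausible reading of the 257 versus 512 mismatch is a one-sided versus a full-length FFT axis: for a real signal and NFFT = 512, the non-redundant spectrum has NFFT/2 + 1 = 257 bins. The following is a minimal NumPy-only sketch of that relationship. It is not taken from any of the three libraries, and the random frame is purely illustrative.

```python
import numpy as np

NFFT = 512

# Hypothetical frame: random samples stand in for one windowed speech frame.
rng = np.random.default_rng(0)
frame = rng.standard_normal(NFFT)

full_spectrum = np.fft.fft(frame, n=NFFT)   # two-sided: NFFT bins
half_spectrum = np.fft.rfft(frame, n=NFFT)  # one-sided: NFFT // 2 + 1 bins

print(full_spectrum.shape)
print(half_spectrum.shape)

# For real input, the upper half of the full spectrum is the complex
# conjugate mirror of the lower half, so the one-sided version loses nothing.
mirror_ok = np.allclose(full_spectrum[1:NFFT // 2],
                        np.conj(full_spectrum[:NFFT // 2:-1]))
print(mirror_ok)
```

If a filter bank is built over 512 bins while the power spectrum only has 257, the filters no longer line up with the same frequencies, which could account for a different-looking plot.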

Part 2: MFCC

# python_speech_features (imported here as pyspeech): no liftering (ceplifter=0), Hamming window
temp1_mfcc = pyspeech.mfcc(speaker1, samplerate=sample1, winlen=0.025, winstep=0.01, numcep=NCEPT, nfilt=NFILT, nfft=NFFT,
                           preemph=0.97, ceplifter=0, winfunc=np.hamming, appendEnergy=False)
# speechpy: needs a pre-emphasized signal. Its window is fixed to rectangular, and its mel filter bank is not the same
temp2_mfcc = speechpy.feature.mfcc(emphasized_speaker1, sampling_frequency=sample1, frame_length=0.025, frame_stride=0.01,
                                   num_cepstral=NCEPT, num_filters=NFILT, fft_length=NFFT)
# librosa: needs a pre-emphasized signal, and the log energy is taken by hand. Its STFT uses a Hann window, but its framing is not the same
temp3_energy = librosa.feature.melspectrogram(y=emphasized_speaker1, sr=sample1, S=temp3_pow.T, n_fft=NFFT,
                                              hop_length=frame_step, n_mels=NFILT).T
temp3_energy = np.log(temp3_energy)
temp3_mfcc = librosa.feature.mfcc(y=emphasized_speaker1, sr=sample1, S=temp3_energy.T, n_mfcc=NCEPT, dct_type=2, n_fft=NFFT,
                                  hop_length=frame_step).T

pic2

I tried my best to make the conditions fair. The speechpy plot comes out darker.
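The comments in the code above mention three different windows (Hamming, rectangular, Hann). As a rough illustration of how much that choice alone can move the numbers, here is a NumPy-only sketch that applies each window to the same frame and compares the log power spectra. The 440 Hz test tone and the power normalization by NFFT are assumptions made for the example, not what any of the libraries necessarily does.

```python
import numpy as np

samplerate = 16000
NFFT = 512
frame_len = int(0.025 * samplerate)  # 25 ms frame -> 400 samples

# Hypothetical test frame: a 440 Hz tone standing in for real speech.
t = np.arange(frame_len) / samplerate
frame = np.sin(2 * np.pi * 440.0 * t)

windows = {
    "rectangular": np.ones(frame_len),
    "hamming": np.hamming(frame_len),
    "hann": np.hanning(frame_len),
}

log_power = {}
for name, win in windows.items():
    spectrum = np.fft.rfft(frame * win, n=NFFT)   # zero-padded to NFFT
    power = (np.abs(spectrum) ** 2) / NFFT        # assumed normalization
    log_power[name] = np.log(power + 1e-10)       # offset avoids log(0)

for name, values in log_power.items():
    print(name, values.shape, float(values.mean()))
```

The three log spectra have the same shape but different values, so everything computed downstream (mel energies, DCT, overall brightness of the plot) shifts as well.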

Part 3: Delta coefficients

temp1 = pyspeech.delta(mfcc_speaker1, 2)
temp2 = speechpy.processing.derivative_extraction(mfcc_speaker1.T, 1).T
# librosa: delta taken along the frame axis (axis=0)
temp3 = librosa.feature.delta(mfcc_speaker1, width=5, axis=0, order=1)

pic3

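I can't pass the MFCC matrix to speechpy directly, or the result looks very strange. Also, what these parameters actually do is not what I expected.

I'd like to know what causes these differences. Is it just the things I mentioned above, or did I make some mistakes? I'd appreciate the details, thanks.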
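For reference, a common textbook definition of the delta coefficient is the regression formula d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2). The sketch below implements that formula in plain NumPy with edge padding. The function name `regression_delta` and the ramp input are made up for the example, and the three libraries may differ from it in window length, edge handling, or normalization.

```python
import numpy as np

def regression_delta(features, n=2):
    """Delta of a (num_frames, num_coeffs) matrix along the frame axis."""
    num_frames = features.shape[0]
    denominator = 2 * sum(i ** 2 for i in range(1, n + 1))
    # Repeat the first and last frames so every frame has n neighbours.
    padded = np.pad(features, ((n, n), (0, 0)), mode="edge")
    weights = np.arange(-n, n + 1)
    delta = np.empty_like(features, dtype=float)
    for t in range(num_frames):
        # Weighted sum over the 2n + 1 frames centred on frame t.
        delta[t] = weights @ padded[t:t + 2 * n + 1] / denominator
    return delta

# Hypothetical input: 10 frames, 13 coefficients, each rising by 0.5 per frame.
frames = np.arange(10, dtype=float)[:, None] * 0.5 + np.zeros((1, 13))
d = regression_delta(frames, n=2)
print(d.shape)
print(d[:, 0])
```

On this ramp the interior frames recover the slope, while the first and last frames are pulled toward zero by the padding. Edge handling is therefore one place where two otherwise equivalent delta implementations can disagree.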


1 answer
User
#1 · Posted 2024-04-25 22:16:08

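There are many implementations of MFCC, and they often differ in small ways: the window function shape, the mel filter bank computation, and the DCT can all be different. It is hard to find libraries that are fully compatible. In the long run this should not matter to you, as long as you use the same implementation everywhere. These differences do not affect the results.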
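As one concrete case of "the DCT can be different", here is a NumPy-only sketch of a DCT-II with and without orthonormal scaling. The helper `dct2`, the 26-filter input, and the random log energies are assumptions for illustration. Which scaling each library applies is not stated in the question.

```python
import numpy as np

def dct2(x, ortho=False):
    """DCT-II of a 1-D array, optionally with orthonormal scaling."""
    n_points = len(x)
    n = np.arange(n_points)
    k = n[:, None]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_points))
    coeffs = 2.0 * basis @ x
    if ortho:
        scale = np.full(n_points, np.sqrt(1.0 / (2 * n_points)))
        scale[0] = np.sqrt(1.0 / (4 * n_points))
        coeffs = coeffs * scale
    return coeffs

# Hypothetical log mel energies for one frame (26 filters assumed).
rng = np.random.default_rng(0)
log_energies = rng.standard_normal(26)

raw = dct2(log_energies)[:13]
orthonormal = dct2(log_energies, ortho=True)[:13]
print(raw[:3])
print(orthonormal[:3])
```

The two outputs carry the same information and differ only by fixed scale factors per coefficient, so plots from two libraries can look different while a model trained and evaluated on either one consistently sees equivalent features.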
