<p>I'm currently working on a project where I split audio on silences and then compute MFCC coefficients for each piece. Here is my solution:</p>
<pre><code>import numpy as np
import pydub
import python_speech_features as p
from pydub.silence import split_on_silence


def generate_mfcc_without_silences(path):
    # Load the audio and resample it to 16 kHz
    audio_file = pydub.AudioSegment.from_wav(path)
    audio_file = audio_file.set_frame_rate(16000)
    # Split the audio wherever a silence of at least 200 ms occurs;
    # anything quieter than the file's average loudness counts as silence
    chunks = split_on_silence(audio_file, silence_thresh=audio_file.dBFS, min_silence_len=200)
    mfccs = []
    for chunk in chunks:
        # Compute the MFCCs from the chunk's raw samples
        # (np.array picks up the sample width; assumes mono audio)
        np_chunk = np.array(chunk.get_array_of_samples())
        mfccs.append(p.mfcc(np_chunk, samplerate=audio_file.frame_rate, numcep=26))
    return mfccs
</code></pre>
<p>Notes:</p>
<p>· I resample the audio to 16 kHz, but this is optional.</p>
<p>· I set min_silence_len to 200 (milliseconds) because I wanted to try to isolate individual words.</p>
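<p>To make the effect of silence_thresh and min_silence_len more concrete, here is a rough standard-library sketch of the idea behind silence-based splitting. It is not pydub's implementation: the helper names (dbfs, split_on_silence_sketch), the 10 ms analysis window, and the synthetic signal are all assumptions made for illustration, and unlike pydub it keeps no silence padding around the chunks.</p>

```python
import math


def dbfs(samples, max_amplitude=32768):
    """RMS level of integer samples in dB relative to full scale (16-bit assumed)."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms / max_amplitude) if rms > 0 else float("-inf")


def split_on_silence_sketch(samples, frame_rate, silence_thresh,
                            min_silence_len=200, window_ms=10):
    """Split samples into chunks separated by long quiet stretches.

    A window is "silent" when its level is below silence_thresh (dBFS).
    Only a run of silent windows lasting at least min_silence_len ms
    ends the current chunk; shorter pauses stay inside the chunk.
    """
    window = max(1, frame_rate * window_ms // 1000)
    min_silent_windows = max(1, min_silence_len // window_ms)
    chunks, current, silent_run = [], [], []
    for start in range(0, len(samples), window):
        frame = samples[start:start + window]
        if dbfs(frame) < silence_thresh:
            silent_run.append(frame)
            continue
        if len(silent_run) >= min_silent_windows:
            # Long gap: close the chunk collected so far
            if current:
                chunks.append(current)
            current = []
        elif current:
            # Short pause: keep it as part of the current chunk
            for quiet in silent_run:
                current.extend(quiet)
        silent_run = []
        current.extend(frame)
    if current:
        chunks.append(current)
    return chunks


# Synthetic signal at 1000 samples per second, so 1 sample = 1 ms:
# 300 ms tone, 250 ms silence, 300 ms tone, 50 ms silence, 300 ms tone
tone = [10000, -10000] * 150
signal = tone + [0] * 250 + tone + [0] * 50 + tone

words = split_on_silence_sketch(signal, 1000, silence_thresh=-40, min_silence_len=200)
syllables = split_on_silence_sketch(signal, 1000, silence_thresh=-40, min_silence_len=40)
print([len(c) for c in words])
print([len(c) for c in syllables])
```

<p>With min_silence_len=200 only the 250 ms gap splits the signal, so the 50 ms pause stays inside the second chunk. Lowering it to 40 splits on both gaps, which is the trade-off to tune when trying to isolate single words.</p>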
<p>Based on my function and your requirements, the function you need might look like this:</p>
^{pr2}$