Extracting pitch features from an audio file

3 answers

User

Answer 1 · edited 2024-05-26 16:29:33

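You could start by reading the literature on pitch detection, which is quite extensive. In general, autocorrelation-based methods seem to work well; frequency-domain and zero-crossing methods are less robust (so an FFT on its own does not help much). A good starting point would be to implement one of these two algorithms: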

YAAPT, from: Stephen A. Zahorian and Hongbing Hu, "A spectral/temporal method for robust fundamental frequency tracking," J. Acoust. Soc. Am. 123, 4559 (2008). http://bingweb.binghamton.edu/~hhu1/paper/Zahorian2008spectral.pdf and the MATLAB code here: http://ws2.binghamton.edu/zahorian/yaapt.htm
YIN, from: De Cheveigné, A., Kawahara, H., "YIN, a fundamental frequency estimator for speech and music," J. Acoust. Soc. Am. 111, 1917-1930 (2002). http://audition.ens.fr/adc/pdf/2002_JASA_YIN.pdf

As for off-the-shelf solutions, check out Aubio: C code with a Python wrapper and several pitch extraction algorithms available, including YIN and multiple comb.
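To make the autocorrelation idea mentioned above concrete, here is a minimal sketch in plain Python. It is a bare-bones autocorrelation peak picker, not YIN or YAAPT (both add normalization, candidate refinement and tracking on top of this idea). The function name `estimate_f0_autocorr` and the `fmin`/`fmax` search range are my own assumptions for illustration.

```python
import math


def estimate_f0_autocorr(samples, sample_rate, fmin=50.0, fmax=1000.0):
    """Estimate the fundamental frequency (Hz) of one frame by autocorrelation.

    Hypothetical helper for illustration; returns None for silent or
    too-short frames.
    """
    n = len(samples)
    if n < 2:
        return None

    # Remove the DC offset so it does not dominate the correlation.
    mean = sum(samples) / n
    x = [s - mean for s in samples]

    energy = sum(v * v for v in x)
    min_lag = max(1, int(sample_rate / fmax))      # shortest period searched
    max_lag = min(n - 1, int(sample_rate / fmin))  # longest period searched
    if energy == 0.0 or max_lag <= min_lag:
        return None

    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Autocorrelation at this lag, normalized by the frame energy.
        # Fewer terms are summed at larger lags, which mildly favours the
        # shortest matching period over its multiples.
        acc = sum(x[i] * x[i + lag] for i in range(n - lag))
        score = acc / energy
        if score > best_score:
            best_lag, best_score = lag, score

    if best_lag is None:
        return None
    return sample_rate / best_lag
```

Fed a 2048-sample frame of a 200 Hz sine sampled at 8000 Hz, this should report a value close to 200 Hz. The resolution is limited to whole-sample lags, so real implementations usually interpolate around the peak.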

User

Answer 2 · edited 2024-05-26 16:29:33

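If you are willing to use third-party libraries (at least as a reference for how other people have done this):

Extracting musical information from sound (a presentation from PyCon 2012) shows how to use the EchoNest Python API.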

Here is the relevant EchoNest documentation:

Track API Methods
Detailed Analyze Documentation

Relevant excerpt:

pitch content is given by a “chroma” vector, corresponding to the 12 pitch classes C, C#, D to B, with values ranging from 0 to 1 that describe the relative dominance of every pitch in the chromatic scale. For example a C Major chord would likely be represented by large values of C, E and G (i.e. classes 0, 4, and 7). Vectors are normalized to 1 by their strongest dimension, therefore noisy sounds are likely represented by values that are all close to 1, while pure tones are described by one value at 1 (the pitch) and others near 0.
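The excerpt can be illustrated with a rough sketch of how a chroma vector might be built. This is not EchoNest's actual algorithm, which is not described here. It assumes you already have a list of spectral peaks as `(frequency_hz, magnitude)` pairs, and the function name `chroma_from_peaks` is hypothetical. Each peak is assigned to its nearest equal-tempered note (with A4 = 440 Hz), folded into one of 12 pitch classes, and the result is normalized by its strongest dimension as the excerpt describes.

```python
import math


def chroma_from_peaks(peaks, concert_pitch=440.0):
    """Fold (frequency_hz, magnitude) peaks into a 12-bin chroma vector.

    Hypothetical helper for illustration. Index 0 is C, 1 is C#, ... 11 is B.
    """
    chroma = [0.0] * 12
    for freq, mag in peaks:
        if freq <= 0:
            continue  # log2 is undefined for non-positive frequencies
        # Nearest MIDI note number, then drop the octave to get the class.
        midi = 12.0 * math.log2(freq / concert_pitch) + 69.0
        pitch_class = int(round(midi)) % 12
        chroma[pitch_class] += mag

    # Normalize so the strongest pitch class has the value 1.
    strongest = max(chroma)
    if strongest > 0:
        chroma = [c / strongest for c in chroma]
    return chroma


# A C major triad (C4, E4, G4) with equal magnitudes.
c_major = chroma_from_peaks([(261.63, 1.0), (329.63, 1.0), (392.00, 1.0)])
```

For the triad above, classes 0, 4 and 7 carry all the weight, matching the C major example in the excerpt.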

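EchoNest performs the analysis on their servers. They provide free API keys for non-commercial use.

If EchoNest is not an option, I would look at the open-source aubio project. It has Python bindings, and you can examine the source code to see how they accomplished pitch detection.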

User

Answer 3 · edited 2024-05-26 16:29:33

You can map a frequency to a musical note:

$n=12\cdot\log_2(\frac{f}{C_p})+69$

where $n$ is the MIDI note number to be computed, $f$ is the frequency, and $C_p$ is the concert pitch (440.0 Hz is common in modern music).
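As a small sketch, the formula translates directly into Python. The function names are my own, and the note-name helper assumes the common convention that MIDI note 60 is written C4 (some software labels it C3 instead).

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def frequency_to_midi(f, concert_pitch=440.0):
    """Return the (fractional) MIDI note number for a frequency in Hz."""
    return 12.0 * math.log2(f / concert_pitch) + 69.0


def midi_to_name(n):
    """Name the nearest note, assuming MIDI 60 is written as C4."""
    n = int(round(n))
    return f"{NOTE_NAMES[n % 12]}{n // 12 - 1}"
```

With these, `frequency_to_midi(440.0)` gives 69.0, and `midi_to_name(frequency_to_midi(261.63))` gives "C4". The fractional part of the MIDI number tells you how far the frequency is from the nearest note, in semitones.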

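As you probably know, a single frequency does not make a musical pitch. "Pitch" arises from the perception of the fundamental of a harmonic sound, i.e. a sound consisting mainly of integer multiples of a single frequency (= the fundamental).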

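If you want to use chroma features in Python, you can use the Bregman Audio-Visual Information Toolbox. Note that chroma features do not give you information about the octave of a pitch, so you only get information about the pitch class.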

from bregman.suite import Chromagram
audio_file = "mono_file.wav"
F = Chromagram(audio_file, nfft=16384, wfft=8192, nhop=2205)
F.X # all chroma features
F.X[:,0] # one feature

The general problem of extracting pitch information from audio is called pitch detection.
