How can I convert the voice.csv data to WAV audio?

Posted 2024-05-16 20:45:09


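I am trying to convert the voice.csv data to WAV audio, but the file has 21 columns and my code only accepts two values (columns) per row. What should I change in my code to create a WAV file?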

The data comes from https://www.kaggle.com/primaryobjects/voicegender, which also provides a description of the data:

The following acoustic properties of each voice are measured and included within the CSV:

  • meanfreq: mean frequency (in kHz)
  • sd: standard deviation of frequency
  • median: median frequency (in kHz)
  • Q25: first quantile (in kHz)
  • Q75: third quantile (in kHz)
  • IQR: interquantile range (in kHz)
  • skew: skewness (see note in specprop description)
  • kurt: kurtosis (see note in specprop description)
  • sp.ent: spectral entropy
  • sfm: spectral flatness
  • mode: mode frequency
  • centroid: frequency centroid (see specprop)
  • peakf: peak frequency (frequency with highest energy)
  • meanfun: average of fundamental frequency measured across acoustic signal
  • minfun: minimum fundamental frequency measured across acoustic signal
  • maxfun: maximum fundamental frequency measured across acoustic signal
  • meandom: average of dominant frequency measured across acoustic signal
  • mindom: minimum of dominant frequency measured across acoustic signal
  • maxdom: maximum of dominant frequency measured across acoustic signal
  • dfrange: range of dominant frequency measured across acoustic signal
  • modindx: modulation index. Calculated as the accumulated absolute difference between adjacent measurements of fundamental frequencies divided by the frequency range
  • label: male or female

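My code expects two columns of input. I tried to treat the columns as time and frequency, and I tried skipping several columns, but I did not get the result I wanted.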

import wave
import struct
import sys
import csv
import numpy 
from scipy.io import wavfile
from scipy.signal import resample


def write_wav(data, filename, framerate, amplitude):
    wavfile = wave.open(filename,'w')
    nchannels = 1
    sampwidth = 2
    framerate = framerate
    nframes = len(data)
    comptype = "NONE"
    compname = "not compressed"
    wavfile.setparams((nchannels,
                        sampwidth,
                        framerate,
                        nframes,
                        comptype,
                        compname))
    frames = []
    for s in data:
        mul = int(s * amplitude)
        frames.append(struct.pack('h', mul))

    # struct.pack returns bytes, so the frames must be joined as bytes
    frames = b''.join(frames)
    wavfile.writeframes(frames)
    wavfile.close()
    print("%s written" %(filename))


if __name__ == "__main__":
    if len(sys.argv) <= 1:
        print ("You must supply a filename to generate")
        exit(-1)
    for fname in sys.argv[1:]:

        data = []
        for time, value in csv.reader(open('voice.csv'), delimiter=','):
            try:
                data.append(float(value))#Here you can see that the time column is skipped
            except ValueError:
                pass # Just skip it


        arr = numpy.array(data)#Just organize all your samples into an array
        # Normalize data
        arr /= numpy.max(numpy.abs(data)) #Divide all your samples by the max sample value
        filename_head, extension = fname.rsplit(',',1)
        data_resampled = resample( arr, len(data) )
        wavfile.write('rec.wav', 16000, data_resampled) #resampling at 16khz
        print ("File written successfully !")






ValueError                                Traceback (most recent call last)
<ipython-input-10-ad8c56a24b4d> in <module>
      6
      7         data = []
----> 8         for time, value in csv.reader(open('voice.csv'), delimiter=','):
      9             try:
     10                 data.append(float(value))#Here you can see that the time column is skipped

ValueError: too many values to unpack (expected 2)
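The traceback above comes from unpacking each CSV row into exactly two names (`time, value`) while every row holds more than two fields. As a minimal sketch, the snippet below reproduces the error on a small in-memory excerpt and then reads columns by name with `csv.DictReader`. The excerpt `SAMPLE`, its values, and the helper `read_columns` are hypothetical and only for illustration. Reading columns this way avoids the `ValueError`, but the values are still summary statistics rather than audio samples.

```python
import csv
import io

# Hypothetical excerpt: only 4 of the dataset's columns, with made-up values.
SAMPLE = """meanfreq,sd,meanfun,label
0.18,0.06,0.12,male
0.20,0.04,0.19,female
"""

# Reproduce the error: a row with more than two fields cannot be
# unpacked into two names.
first_row = next(csv.reader(io.StringIO(SAMPLE)))
try:
    time, value = first_row
    unpack_error = None
except ValueError as exc:
    unpack_error = str(exc)
print(unpack_error)


def read_columns(handle, names):
    """Return one list of floats per requested column name."""
    columns = {name: [] for name in names}
    for row in csv.DictReader(handle):
        for name in names:
            columns[name].append(float(row[name]))
    return columns


columns = read_columns(io.StringIO(SAMPLE), ["meanfreq", "meanfun"])
print(columns)
```

With the real file, `io.StringIO(SAMPLE)` would be replaced by `open('voice.csv', newline='')`.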

1 Answer
User
#1 · Posted 2024-05-16 20:45:09

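This file contains statistical aggregates, not actual audio data. You cannot reverse-engineer a reliable audio signal from these summary measurements alone.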

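Put another way, it is like trying to reconstruct the terrain profile between two points from only the distance and the travel time. Additional measurements, such as the altitude difference or the average acceleration over some period, would greatly narrow the range of possible guesses, but you would still be guessing wildly.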
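To make the point about aggregates concrete, here is a small sketch with toy numbers (not taken from voice.csv). The two sequences `signal_a` and `signal_b` and the helper `summarize` are hypothetical. They show that different sample sequences can produce identical summary statistics, so the statistics alone do not identify the sequence.

```python
import statistics

# Two different toy "signals": same values, different order in time.
signal_a = [0.0, 0.0, 4.0, 4.0]
signal_b = [0.0, 4.0, 0.0, 4.0]


def summarize(samples):
    """Collapse a sequence into a few aggregate measurements."""
    return {
        "mean": statistics.mean(samples),
        "sd": statistics.pstdev(samples),
        "min": min(samples),
        "max": max(samples),
    }


summary_a = summarize(signal_a)
summary_b = summarize(signal_b)

print(summary_a == summary_b)  # the aggregates match
print(signal_a == signal_b)    # the sequences do not
```

The acoustic properties in the CSV are spectral aggregates rather than sample statistics, but the same loss of information applies: many distinct recordings could share one row of values.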
