Converting multichannel PyAudio input into NumPy arrays

13 votes

1 answer

10651 views

Asked 2025-04-17 23:49

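Every example I can find is mono, with CHANNELS = 1. How do I read stereo or multichannel input in PyAudio using the callback method and convert it into a 2D NumPy array, or into several 1D arrays?

For mono input, something like this works: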

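import numpy as np
import pyaudio

def callback(in_data, frame_count, time_info, status):
    global result
    global result_waiting

    if in_data:
        # Interpret the raw bytes as 32-bit float samples
        result = np.frombuffer(in_data, dtype=np.float32)
        result_waiting = True
    else:
        print('no input')

    # Input-only stream, so there is no output data to return
    return None, pyaudio.paContinue

# p is a pyaudio.PyAudio() instance and fs is the sample rate in Hz,
# both defined elsewhere
stream = p.open(format=pyaudio.paFloat32,
                channels=1,
                rate=fs,
                output=False,
                input=True,
                frames_per_buffer=fs,
                stream_callback=callback)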

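With stereo input this no longer works: the result array is twice as long as expected. My guess is that the channels are interleaved, but I can't find any documentation that confirms it.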

Tags: audio processing, pyaudio, NumPy arrays, callbacks, multichannel audio, stereo

1 Answer

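The data appears to be interleaved sample by sample, with the left channel first. With a signal on the left input and silence on the right, I got: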

result = [0.2776, -0.0002,  0.2732, -0.0002,  0.2688, -0.0001,  0.2643, -0.0003,  0.2599, ...
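
To make that layout concrete, here is a small sketch that builds a synthetic stereo buffer with the same interleaving and pulls each channel back out as a separate 1D array using strided slicing. The sample values and the names `left`, `right`, and `in_data` are made up for illustration, and no audio device is involved.

```python
import numpy as np

# Hypothetical sample values: signal on the left, silence on the right.
left = np.array([0.25, 0.5, 0.75], dtype=np.float32)
right = np.zeros(3, dtype=np.float32)

# Build the interleaved layout [L0, R0, L1, R1, ...] and serialize it,
# mimicking the bytes a stereo paFloat32 callback would receive.
interleaved = np.empty(left.size + right.size, dtype=np.float32)
interleaved[0::2] = left
interleaved[1::2] = right
in_data = interleaved.tobytes()

# Decode the raw bytes, then take every second sample for each channel.
flat = np.frombuffer(in_data, dtype=np.float32)
left_out = flat[0::2]
right_out = flat[1::2]

print(flat)       # alternating left/right samples
print(left_out)   # left channel only
print(right_out)  # right channel only
```

Slicing like this answers the "several 1D arrays" part of the question without any reshaping; both slices are views onto the same decoded buffer.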

To split it into the two stereo channels, reshape it into a 2D array:

result = np.frombuffer(in_data, dtype=np.float32)
result = np.reshape(result, (frames_per_buffer, 2))

Now the left channel is result[:, 0] and the right channel is result[:, 1].
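
The same reshape extends to any channel count if the frame count is left for NumPy to infer with `-1`. The sketch below is only an illustration under assumptions: `deinterleave` is a hypothetical helper name, the 3-channel data is synthetic, and the samples are taken to be float32 as in the stream above.

```python
import numpy as np


def deinterleave(in_data, channels, dtype=np.float32):
    """Return an array of shape (frames, channels) from interleaved bytes.

    `deinterleave` is an illustrative name, not part of PyAudio or NumPy.
    """
    flat = np.frombuffer(in_data, dtype=dtype)
    # -1 lets NumPy infer the frame count; reshape raises ValueError if the
    # sample count is not a multiple of `channels`.
    return flat.reshape(-1, channels)


# Synthetic 3-channel buffer with 4 frames: frame i holds [i, i + 10, i + 20].
frames = np.array([[i, i + 10, i + 20] for i in range(4)], dtype=np.float32)
in_data = frames.tobytes()  # C order, so samples are interleaved per frame

block = deinterleave(in_data, channels=3)
print(block.shape)   # (4, 3)
print(block[:, 1])   # second channel: [10. 11. 12. 13.]

# np.frombuffer on a bytes object returns a read-only array;
# copy it before modifying samples in place.
editable = block.copy()
editable[:, 0] *= 0.5
```

If a channel-major layout of shape (channels, frames) is more convenient, `block.T` gives that view without copying.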

The conversion in both directions can be wrapped in a pair of helper functions:

import numpy as np

def decode(in_data, channels):
    """
    Convert a byte stream into a 2D numpy array with
    shape (chunk_size, channels)

    Samples are interleaved, so for a stereo stream with left channel
    of [L0, L1, L2, ...] and right channel of [R0, R1, R2, ...], the output
    is ordered as [L0, R0, L1, R1, ...]
    """
    # TODO: handle data type as parameter, convert between pyaudio/numpy types
    result = np.frombuffer(in_data, dtype=np.float32)

    # reshape needs integer dimensions, so use integer division and
    # check that the samples divide evenly among the channels
    chunk_length, remainder = divmod(len(result), channels)
    assert remainder == 0, "sample count must be a multiple of channels"

    result = np.reshape(result, (chunk_length, channels))
    return result


def encode(signal):
    """
    Convert a 2D numpy array into a byte stream for PyAudio

    Signal should be a numpy array with shape (chunk_size, channels)
    """
    # Row-major flattening yields [L0, R0, L1, R1, ...]
    interleaved = signal.flatten()

    # TODO: handle data type as parameter, convert between pyaudio/numpy types
    out_data = interleaved.astype(np.float32).tobytes()
    return out_data
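
For the callback case in the question, one possible arrangement is sketched below. It is an outline under stated assumptions rather than a tested recipe: `CHANNELS`, `PA_CONTINUE`, and `audio_queue` are illustrative names, a `queue.Queue` stands in for the question's globals, and the stream-opening part is left as a comment (with placeholder rate and buffer size) because it needs a real input device.

```python
import queue

import numpy as np

CHANNELS = 2                 # assumed stereo input
PA_CONTINUE = 0              # numeric value of pyaudio.paContinue; prefer the
                             # constant itself when pyaudio is imported
audio_queue = queue.Queue()  # hands decoded blocks to a consumer thread


def callback(in_data, frame_count, time_info, status):
    """PyAudio-style input callback that queues (frame_count, CHANNELS) blocks."""
    if in_data:
        flat = np.frombuffer(in_data, dtype=np.float32)
        # Copy to get a writable array that owns its data.
        audio_queue.put(flat.reshape(frame_count, CHANNELS).copy())
    return None, PA_CONTINUE


# With a real device, the callback would be passed to PyAudio roughly like
# this (rate and frames_per_buffer are placeholder values):
#   import pyaudio
#   p = pyaudio.PyAudio()
#   stream = p.open(format=pyaudio.paFloat32, channels=CHANNELS, rate=44100,
#                   input=True, frames_per_buffer=1024,
#                   stream_callback=callback)

# Simulate one callback invocation with 4 frames of synthetic stereo data.
fake = np.arange(8, dtype=np.float32).tobytes()
callback(fake, 4, None, 0)
block = audio_queue.get_nowait()
print(block[:, 0])  # left:  [0. 2. 4. 6.]
print(block[:, 1])  # right: [1. 3. 5. 7.]
```

A queue avoids the `result_waiting` flag: the consumer can simply block on `audio_queue.get()` until the next chunk arrives.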

Answered 2025-04-17 by Python大师
