Python speech recognition library - always listening?

5 votes

4 answers

28817 views

数据工程师

Asked 2025-04-18 17:51

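I've recently been using Python's speech recognition library to launch applications. My end goal is to use it for voice-controlled home automation through the Raspberry Pi's GPIO pins.

I have this working: it recognizes my voice and launches the application. The problem is that it seems to get stuck on the one word I say (for example, I say "Internet" and it opens Chrome an infinite number of times).

This isn't how I've seen while loops behave before. I can't figure out how to stop it looping. Do I need to do something outside the loop to make it work properly? Please see the code below.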

http://pastebin.com/auquf1bR

import pyaudio,os
import speech_recognition as sr
r = sr.Recognizer()
with sr.Microphone() as source:
        audio = r.listen(source)

def excel():
        os.system("start excel.exe")

def internet():
        os.system("start chrome.exe")

def media():
        os.system("start wmplayer.exe")

def mainfunction():
        user = r.recognize(audio)
        print(user)
        if user == "Excel":
                excel()
        elif user == "Internet":
                internet()
        elif user == "music":
                media()
while 1:
        mainfunction()

4 Answers

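It's a little unfortunate, but you have to initialize the microphone on every pass through the loop, because this approach calls r.adjust_for_ambient_noise(source) each time, which makes sure it can still understand you in a noisy room. Setting the energy threshold takes time, so if you give commands back to back, some of your words may be missed.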

import pyaudio,os
import speech_recognition as sr
r = sr.Recognizer()


def excel():
        os.system("start excel.exe")

def internet():
        os.system("start chrome.exe")

def media():
        os.system("start wmplayer.exe")

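def mainfunction():
        # Open the microphone and recalibrate on every call, then capture one phrase.
        with sr.Microphone() as source:
            r.adjust_for_ambient_noise(source)
            audio = r.listen(source)
        # recognize() is the method name in older SpeechRecognition releases;
        # current releases provide r.recognize_google(audio) instead.
        user = r.recognize(audio)
        print(user)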
        if user == "Excel":
                excel()
        elif user == "Internet":
                internet()
        elif user == "music":
                media()
while 1:
        mainfunction()
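
Since calibrating on every pass costs time, one possible compromise is to recalibrate only every so often. The following is a minimal sketch, not part of the original answer: `listen_loop`, `recalibrate_every`, `max_phrases` and `clock` are hypothetical names introduced for illustration. The only library calls it relies on are `adjust_for_ambient_noise` and `listen`; recognition is passed in as a callable so the sketch does not depend on one recognizer method.

```python
import time


def listen_loop(recognizer, source, recognize, handle,
                recalibrate_every=30.0, max_phrases=None,
                clock=time.monotonic):
    """Capture phrases in a loop, recalibrating at most every
    `recalibrate_every` seconds.

    `recognizer` needs adjust_for_ambient_noise(source) and listen(source);
    `recognize` turns captured audio into text; `handle` receives that text.
    The helper and its parameters are hypothetical, not SpeechRecognition API.
    """
    last_calibrated = None
    heard = 0
    while max_phrases is None or heard < max_phrases:
        now = clock()
        if last_calibrated is None or now - last_calibrated >= recalibrate_every:
            recognizer.adjust_for_ambient_noise(source)  # slow step, done sparingly
            last_calibrated = now
        audio = recognizer.listen(source)  # fresh audio on every pass
        heard += 1
        try:
            text = recognize(audio)
        except Exception as exc:  # broad on purpose: this is only a sketch
            print("Could not recognize speech:", exc)
            continue
        handle(text)
```

With the real library this might be called as `listen_loop(r, source, r.recognize_google, print)` inside a `with sr.Microphone() as source:` block. Whether a 30 second interval suits a given room is an assumption to tune.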

Answered 2025-04-18 by Python大师

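I've spent a lot of time on this topic.

I'm currently developing an open-source, cross-platform virtual assistant called Athena Voice, written in Python 3: https://github.com/athena-voice/athena-voice-client

Users can use it much like Siri, Cortana, or Amazon's Echo.

It also has a very simple "module" system, so users can easily write their own modules to extend it. Let me know if that would be useful to you.

Otherwise, I'd suggest looking at Pocketsphinx and the Google-based Python speech-to-text (STT) and text-to-speech (TTS) packages.

On Python 3.4, Pocketsphinx can be installed with: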

pip install pocketsphinx

However, you need to install the PyAudio dependency separately (unofficial download): http://www.lfd.uci.edu/~gohlke/pythonlibs/#pyaudio

The two Google packages can be installed with:

pip install SpeechRecognition gTTS

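Google speech recognition (STT): https://pypi.python.org/pypi/SpeechRecognition/

Google text-to-speech (TTS): https://pypi.python.org/pypi/gTTS/1.0.2

Pocketsphinx should be used for offline wake-word recognition, and Google's speech recognition for active listening.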
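
The split described above can be pictured as a two-stage loop: wait offline for the wake word, then recognize a single command online. This is a rough sketch under assumptions, not how Athena Voice is implemented; `assistant_loop` and its three callables are hypothetical placeholders that the caller would back with Pocketsphinx and a Google-based recognizer.

```python
def assistant_loop(wait_for_wake_word, recognize_command, handle,
                   max_rounds=None):
    """Alternate between offline wake-word spotting and one online recognition.

    All three callables are hypothetical placeholders supplied by the caller:
    wait_for_wake_word() blocks until the keyphrase is heard,
    recognize_command() returns the recognized text (or an empty value),
    handle(text) acts on the command.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        wait_for_wake_word()        # stage 1: offline, cheap, always running
        text = recognize_command()  # stage 2: one online recognition request
        if text:                    # skip empty or failed recognitions
            handle(text)
        rounds += 1
```

Keeping the stages as plain callables means audio only has to leave the machine after the wake word has been heard, and each stage can be swapped or tested on its own.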

Answered 2025-04-18 by Python大师

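To make this concrete, here is an example of listening continuously for a keyphrase with pocketsphinx. It is much simpler than continuously sending audio to Google, and it gives you a more flexible solution.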

import sys, os, pyaudio
from pocketsphinx import *

modeldir = "/usr/local/share/pocketsphinx/model"
# Create a decoder with certain model
config = Decoder.default_config()
config.set_string('-hmm', os.path.join(modeldir, 'hmm/en_US/hub4wsj_sc_8k'))
config.set_string('-dict', os.path.join(modeldir, 'lm/en_US/cmu07a.dic'))
config.set_string('-keyphrase', 'oh mighty computer')
config.set_float('-kws_threshold', 1e-40)

decoder = Decoder(config)
decoder.start_utt('spotting')

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
stream.start_stream()

# Feed microphone audio to the decoder in 1024-frame chunks.
while True:
    buf = stream.read(1024)
    decoder.process_raw(buf, False, False)
    hyp = decoder.hyp()
    if hyp is not None and hyp.hypstr == 'oh mighty computer':
        print("Detected keyword, restarting search")
        decoder.end_utt()
        decoder.start_utt('spotting')

Answered 2025-04-18 by Python大师

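The problem is that you only listen for speech once, at the start of the program, and then just keep calling recognize on that same saved piece of audio. You need to move the code that actually listens for speech into the while loop: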

import pyaudio,os
import speech_recognition as sr


def excel():
        os.system("start excel.exe")

def internet():
        os.system("start chrome.exe")

def media():
        os.system("start wmplayer.exe")

def mainfunction(source):
    # Capture a fresh phrase on every call instead of reusing one recording.
    audio = r.listen(source)
    user = r.recognize(audio)
    print(user)
    if user == "Excel":
        excel()
    elif user == "Internet":
        internet()
    elif user == "music":
        media()

if __name__ == "__main__":
    r = sr.Recognizer()
    with sr.Microphone() as source:
        while 1:
            mainfunction(source)
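
As the list of commands grows, the if/elif chain can be replaced by a lookup table. The sketch below is an optional variation, not part of the fix itself. `make_dispatcher` is a hypothetical helper, and it assumes the recognized text may not always come back with the exact capitalization used in the comparisons, so it normalizes case and surrounding whitespace before matching.

```python
def make_dispatcher(commands):
    """Build a dispatch function from a {phrase: action} mapping.

    Hypothetical helper: phrases are matched case-insensitively and with
    surrounding whitespace ignored.
    """
    table = {phrase.strip().lower(): action
             for phrase, action in commands.items()}

    def dispatch(text):
        """Run the action registered for `text`; return True if one matched."""
        action = table.get(text.strip().lower())
        if action is None:
            return False  # unknown phrase: do nothing
        action()
        return True

    return dispatch
```

With the functions from the answer this could be set up once as `dispatch = make_dispatcher({"Excel": excel, "Internet": internet, "music": media})` and then called as `dispatch(user)` inside `mainfunction`.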

Answered 2025-04-18 by Python大师
