Can you use the same decoder for multiple files in Pocketsphinx?

Posted 2024-04-18 11:25:52


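Is it possible to use the same decoder for multiple wav files in Pocketsphinx (Python)? I have the following code snippet, which is very standard, except that I call the decoder twice on the same file. The outputs are not the same, however. I've also tried using the decoder twice on different files, and the outputs differ depending on the order in which I call the files: the first file decodes correctly, but the second does not. Furthermore, this only happens if there is some output from the first file; if the first file doesn't contain any words, then the second file decodes fine. This leads me to believe that the decoder is somehow modified after decoding a file. Am I right? Is there a way to reset the decoder, or otherwise make it work for multiple files? It seems an example should be given here: https://github.com/cmusphinx/pocketsphinx/blob/master/swig/python/test/decoder_test.py.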

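import os
import pocketsphinx as ps  # assumed import; the original snippet omitted it

# MODELDIR and DATADIR are assumed to be defined elsewhere, as in decoder_test.py.
config = ps.Decoder.default_config()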
config.set_string('-hmm', os.path.join(MODELDIR, 'en-US/acoustic-model'))
config.set_string('-lm', os.path.join(MODELDIR, 'en-US/language-model.lm.bin'))
config.set_string('-dict', os.path.join(MODELDIR, 'en-US/pronounciation-dictionary.dict'))
config.set_string('-logfn', 'pocketsphinxlog')
decoder = ps.Decoder(config)

wavname16_1 =  os.path.join(DATADIR, 'arctic_a0001.wav')
# Decode streaming data.
decoder.start_utt()
stream = open(wavname16_1, 'rb')
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
    else:
        break
decoder.end_utt()
stream.close()
words = [(seg.word, seg.prob) for seg in decoder.seg()]
print("arctic1: " + str(words))

wavname16_2 =  os.path.join(DATADIR, 'arctic_a0002.wav')
decoder.start_utt()
stream = open(wavname16_2, 'rb')
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
    else:
        break
decoder.end_utt()
stream.close()
words = [(seg.word, seg.prob) for seg in decoder.seg()]
# str() is needed here: concatenating a str and a list raises TypeError.
print("arctic2: " + str(words))

Edit - more information:

If arctic_a0001.wav is http://festvox.org/cmu_arctic/cmu_arctic/cmu_us_bdl_arctic/wav/arctic_a0001.wav, arctic_a0002.wav is http://festvox.org/cmu_arctic/cmu_arctic/cmu_us_bdl_arctic/wav/arctic_a0002.wav, and the dictionary is the single line:

^{pr2}$

then the current output is:

arctic1: [('<s>', 1), ('of', 1), ('of', -12001), ('<sil>', 0), ('of', -16211), ('<sil>', -1205), ('of', -13991), ('of', 0), ('<sil>', 0), ('of', -31232), ('</s>', 0)]
arctic2: [('<s>', -3), ('[SPEECH]', -725), ('<sil>', -1), ('[SPEECH]', -6), ('<sil>', -20), ('of', -6162), ('[SPEECH]', -397), ('</s>', 0)]

But if we switch them, the output becomes

arctic2: [('<s>', 0), ('of', 0), ('<sil>', 0), ('of', -29945), ('<sil>', -20), ('of', -26004), ('of', 0), ('of', 0), ('<sil>', 0), ('of', -84868), ('of', -35690), ('</s>', 0)]
arctic1: [('<s>', -3), ('of', -14886), ('of', -30237), ('<sil>', 0), ('of', -22103), ('of', 1), ('<sil>', 0), ('of', -30795), ('of', -65040), ('</s>', 0)]

So the outputs for arctic1 and arctic2 depend on the order. Furthermore, if we use arctic1 twice, the output is

[('<s>', 1), ('of', 1), ('of', -12001), ('<sil>', 0), ('of', -16211), ('<sil>', -1205), ('of', -13991), ('of', 0), ('<sil>', 0), ('of', -31232), ('</s>', 0)]
[('<s>', 1), ('of', -24424), ('of', -24554), ('<sil>', 2), ('[SPEECH]', -37257), ('of', -37008), ('<sil>', -461), ('of', -20422), ('of', 0), ('<sil>', 0), ('of', -3570), ('[SPEECH]', -42), ('</s>', 0)]

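Might the problem be that I'm not using start_stream()? I'm not sure how it should be used. Even if I call decoder.start_stream() (immediately before decoder.start_utt()), the outputs still differ; they become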

[('<s>', 1), ('of', 1), ('of', -12001), ('<sil>', 0), ('of', -16211), ('<sil>', -1205), ('of', -13991), ('of', 0), ('<sil>', 0), ('of', -31232), ('</s>', 0)]
[('<s>', -2), ('of', -33113), ('of', -29715), ('<sil>', 1), ('[SPEECH]', -37258), ('of', -37009), ('<sil>', -461), ('of', -20422), ('of', 0), ('<sil>', 0), ('of', -3570), ('[SPEECH]', -42), ('</s>', 0)]

If you'd like the full logs, here (http://pastebin.com/2dNeyS1x) is the log for arctic1 before arctic2, here (http://pastebin.com/Nkvj2G0g) is the log for arctic2 before arctic1, here (http://pastebin.com/HWq6j7X2) is the log for arctic1 twice in a row with start_stream, and here (http://pastebin.com/MsadW4nh) is the log for arctic1 twice in a row without start_stream.


1 Answer

Is it possible to use the same decoder for multiple wav files in Pocketsphinx (Python)?

Yes.

I have the following code snippet, which is very standard, except that I call the decoder twice on the same file. The outputs are not the same, however.

You need to call decoder.start_stream() for the second file to reset the decoder timing.
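As a rough sketch of that suggestion, the per-file loop from the question can be wrapped in one helper that calls start_stream() before every file, so no file is decoded on top of state left over from the previous one. The helper name decode_file and its chunk_size parameter are hypothetical; the decoder methods are the ones already used in the question. Note that the asker reports the outputs still differed after adding start_stream(), so this shows the suggested call order rather than a confirmed fix.

```python
def decode_file(decoder, path, chunk_size=1024):
    """Decode one audio file with an existing decoder.

    `decoder` is assumed to be a Pocketsphinx Decoder (or any object with
    the same start_stream/start_utt/process_raw/end_utt/seg methods).
    Returns a list of (word, prob) pairs, as in the question.
    """
    # Reset per-stream state before starting a new file (the answer's advice).
    decoder.start_stream()
    decoder.start_utt()
    with open(path, 'rb') as stream:
        while True:
            buf = stream.read(chunk_size)
            if not buf:
                break
            decoder.process_raw(buf, False, False)
    decoder.end_utt()
    return [(seg.word, seg.prob) for seg in decoder.seg()]
```

With the decoder built as in the question, each file would then be handled by a call such as decode_file(decoder, os.path.join(DATADIR, 'arctic_a0002.wav')).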

I've also tried using the decoder twice on different files, and the outputs are different depending on the order in which I call the files - the first file decodes correctly, but the second file does not decode correctly. Furthermore, this only happens if there is some output from the first file - if the first file doesn't have any words, then the second file decodes fine.

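Well, different factors can affect the result. It's hard to say without an example. You'd better provide sample files and the problematic output to get an answer to this question.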
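When preparing that kind of example, it may help to report exactly where two runs diverge rather than pasting whole lists. The following is a minimal sketch, not part of Pocketsphinx: first_difference is a hypothetical helper that compares only the words (ignoring the probabilities), and the two lists are the outputs the question reports for decoding arctic1 twice without start_stream().

```python
def first_difference(run_a, run_b):
    """Return the index where two (word, prob) runs first differ in words.

    Returns None if both runs contain the same word sequence.
    """
    words_a = [word for word, _prob in run_a]
    words_b = [word for word, _prob in run_b]
    for index, (a, b) in enumerate(zip(words_a, words_b)):
        if a != b:
            return index
    if len(words_a) != len(words_b):
        # One run is a strict prefix of the other.
        return min(len(words_a), len(words_b))
    return None

# Outputs copied from the question: arctic1 decoded twice with one decoder.
first_pass = [('<s>', 1), ('of', 1), ('of', -12001), ('<sil>', 0),
              ('of', -16211), ('<sil>', -1205), ('of', -13991), ('of', 0),
              ('<sil>', 0), ('of', -31232), ('</s>', 0)]
second_pass = [('<s>', 1), ('of', -24424), ('of', -24554), ('<sil>', 2),
               ('[SPEECH]', -37257), ('of', -37008), ('<sil>', -461),
               ('of', -20422), ('of', 0), ('<sil>', 0), ('of', -3570),
               ('[SPEECH]', -42), ('</s>', 0)]

print(first_difference(first_pass, second_pass))  # prints 4
```

Here the word sequences first diverge at index 4, where the second pass emits '[SPEECH]' instead of 'of'.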
