Need help understanding this Python Viterbi algorithm

2 answers

User

#1 · edited 2024-06-17 10:13:16

The expanded version of the list comprehension looks like this:

all_probs = []

for j in range(max(0, i - max_word_length), i):
    all_probs.append((probs[j] * word_prob(text[j:i]), j))

prob_k, k = max(all_probs)
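
To see where this loop sits, here is a minimal sketch of a complete segmentation function built around it. The names `WORD_COUNTS`, `TOTAL`, `MAX_WORD_LENGTH` and `viterbi_segment` are assumptions made for illustration, and the toy counts are borrowed from the Ruby answer in this thread; the code in the question may be organised differently.

```python
from collections import Counter

# Hypothetical toy counts (borrowed from the Ruby answer in this thread).
WORD_COUNTS = Counter({"hello": 12, "darkness": 6, "friend": 79, "my": 1, "old": 5})
TOTAL = sum(WORD_COUNTS.values())
MAX_WORD_LENGTH = max(len(w) for w in WORD_COUNTS)


def word_prob(word):
    # Unigram probability; a Counter returns 0 for unseen words.
    return WORD_COUNTS[word] / TOTAL


def viterbi_segment(text):
    probs = [1.0]  # probs[i]: best probability of any segmentation of text[:i]
    lasts = [0]    # lasts[i]: start index of the last word in that segmentation
    for i in range(1, len(text) + 1):
        # The expanded loop: try every start j for a word ending at i.
        all_probs = []
        for j in range(max(0, i - MAX_WORD_LENGTH), i):
            all_probs.append((probs[j] * word_prob(text[j:i]), j))
        prob_k, k = max(all_probs)
        probs.append(prob_k)
        lasts.append(k)

    # Follow the back-pointers from the end of the text to recover the words.
    words = []
    i = len(text)
    while i > 0:
        words.append(text[lasts[i]:i])
        i = lasts[i]
    words.reverse()
    return words, probs[-1]


words, prob = viterbi_segment("hellodarknessmyoldfriend")
```

Because `max` compares tuples element by element, it picks the highest probability first and falls back to the larger `j` only on ties.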

I hope that helps explain it. If not, edit your question and point out which statements you don't understand.

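User

#2 · edited 2024-06-17 10:13:16

Here is a working Ruby implementation, in case anyone else can use it. I translated the list comprehension discussed above into what I consider an appropriate level of unreadable Ruby.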

def viterbi(text)
  probabilities = [1.0]
  lasts = [0]

  # Iterate over the letters in the compound.
  # eg. [h ellodarkness],[he llodarkness],...

  (1..text.length).each do |i|
    # Try every candidate start j for a word ending at i and keep the best one.
    prob_k, k = ([0, i - maximum_word_length].max...i)
                .map { |j| [probabilities[j] * word_probability(text[j...i]), j] }
                .max_by(&:first)
    probabilities << prob_k
    lasts << k
  end

  words = []
  i = text.length
  while i.positive?
    words << text[lasts[i]...i]
    i = lasts[i]
  end
  words.reverse!
  [words, probabilities.last]
end

def word_probability(word)
  word_counts[word].to_f / word_counts_sum.to_f
end

def word_counts_sum
  @word_counts_sum ||= word_counts.values.sum.to_f
end

def maximum_word_length
  @maximum_word_length ||= word_counts.keys.map(&:length).max
end

def word_counts
  return @word_counts if @word_counts 
  @word_counts = {"hello" => 12, "darkness" => 6, "friend" => 79, "my" => 1, "old" => 5}
  @word_counts.default = 0
  @word_counts
end

puts "Best split is %s with probability %.6f" % viterbi("hellodarknessmyoldfriend")

=> Best split is ["hello", "darkness", "my", "old", "friend"] with probability 0.000002

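The main difficulty was that Python and Ruby define ranges differently (open vs. closed intervals). The algorithm is fast: it tries at most maximum_word_length candidate words per position in the text.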
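
As a small illustration of the range difference, the sketch below shows the Python side, with the Ruby equivalents noted in comments. The variable names are made up for this example.

```python
text = "hellodarknessmyoldfriend"

# Python's range(a, b) is half-open: it includes a and excludes b.
end_positions = list(range(1, len(text) + 1))

# So the end positions run from 1 up to and including len(text).
# In Ruby the same positions are 1..text.length (inclusive upper bound)
# or 1...(text.length + 1) (exclusive upper bound).
first, last = end_positions[0], end_positions[-1]

# Slices are half-open too: text[j:i] in Python matches text[j...i] in Ruby.
last_word = text[18:24]
```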

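It may be advantageous to use log-likelihoods rather than raw probabilities, since the repeated multiplication can underflow and/or accumulate floating-point error with longer words.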
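
A minimal sketch of that idea in Python, assuming the same toy counts as above: each product of probabilities becomes a sum of logarithms, and an unseen word maps to negative infinity instead of 0. The names `word_log_prob` and `viterbi_segment_log` are hypothetical.

```python
import math

# Hypothetical toy counts, reused from the Ruby answer above.
WORD_COUNTS = {"hello": 12, "darkness": 6, "friend": 79, "my": 1, "old": 5}
TOTAL = sum(WORD_COUNTS.values())
MAX_WORD_LENGTH = max(len(w) for w in WORD_COUNTS)


def word_log_prob(word):
    # Log of the unigram probability; unseen words get -inf (probability 0).
    count = WORD_COUNTS.get(word, 0)
    return math.log(count / TOTAL) if count > 0 else -math.inf


def viterbi_segment_log(text):
    log_probs = [0.0]  # log(1.0) for the empty prefix
    lasts = [0]
    for i in range(1, len(text) + 1):
        # Sums of logs replace products of probabilities.
        best_log_prob, k = max(
            (log_probs[j] + word_log_prob(text[j:i]), j)
            for j in range(max(0, i - MAX_WORD_LENGTH), i)
        )
        log_probs.append(best_log_prob)
        lasts.append(k)

    # Follow the back-pointers to recover the words, as in the original.
    words = []
    i = len(text)
    while i > 0:
        words.append(text[lasts[i]:i])
        i = lasts[i]
    words.reverse()
    return words, log_probs[-1]


words_log, log_prob = viterbi_segment_log("hellodarknessmyoldfriend")
```

The segmentation is the same as with plain probabilities; only the score changes scale, and `math.exp(log_prob)` recovers the probability when it is large enough to represent.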
