How to create paragraphs from Markov chain output?



I would like to modify the script below so that it builds paragraphs out of a random number of the sentences it generates. In other words, it should join a random number of sentences (say 1-5) before adding a newline.

The script works fine as it is, but its output is short sentences separated by newlines. I would like to collect some of those sentences into paragraphs.

Any thoughts on best practice? Thanks.

<pre><code>"""
from: http://code.activestate.com/recipes/194364-the-markov-chain-algorithm/?in=lang-python
"""

import random
import sys

stopword = "\n" # Since we split on whitespace, this can never be a word
stopsentence = (".", "!", "?",) # Cause a "new sentence" if found at the end of a word
sentencesep = "\n" # String used to separate sentences

# GENERATE TABLE
w1 = stopword
w2 = stopword
table = {}

for line in sys.stdin:
    for word in line.split():
        if word[-1] in stopsentence:
            table.setdefault( (w1, w2), [] ).append(word[0:-1])
            w1, w2 = w2, word[0:-1]
            word = word[-1]
        table.setdefault( (w1, w2), [] ).append(word)
        w1, w2 = w2, word
# Mark the end of the file
table.setdefault( (w1, w2), [] ).append(stopword)

# GENERATE SENTENCE OUTPUT
maxsentences = 20

w1 = stopword
w2 = stopword
sentencecount = 0
sentence = []

while sentencecount < maxsentences:
    newword = random.choice(table[(w1, w2)])
    if newword == stopword: sys.exit()
    if newword in stopsentence:
        print ("%s%s%s" % (" ".join(sentence), newword, sentencesep))
        sentence = []
        sentencecount += 1
    else:
        sentence.append(newword)
    w1, w2 = w2, newword
</code></pre>

<hr/>

Edit 01:

OK, I have cobbled together a simple "paragraph wrapper" that does a good job of gathering sentences into paragraphs, but it interferes with the sentence generator's output. Among other problems, I get excessive repetition of the first words.

The premise is sound, though; I just need to work out why the sentence loop is affected by the paragraph loop. If you can see the problem, please let me know:

^{pr2}$

<hr/>

Edit 02:

Following the answer below, I added <code>sentence = []</code> to the <code>elif</code> branch. That is:

<pre><code>    elif newword in stopsentence:
        print ("%s%s" % (" ".join(sentence), newword), end=" ")
        sentence = [] # this has to be here so the new sentence starts as an empty list
        sentencecount += 1 # increment the sentence counter
</code></pre>

<hr/>

Edit 03:

This is the final iteration of the script. Thanks to 格里夫 for helping me sort this out. I hope others can have some fun with it; I know I will. ;)

FYI: there is one small artifact. Each paragraph ends with an extra space, which you may want to clean up if you use this script. Other than that, it works well as a Markov chain text generator.

<pre><code>###
#
# usage: python markov_sentences.py < input.txt > output.txt
#
# from: http://code.activestate.com/recipes/194364-the-markov-chain-algorithm/?in=lang-python
#
###

import random
import sys

stopword = "\n" # Since we split on whitespace, this can never be a word
stopsentence = (".", "!", "?",) # Cause a "new sentence" if found at the end of a word
sentencesep = "\n" # String used to separate sentences

# GENERATE TABLE
w1 = stopword
w2 = stopword
table = {}

for line in sys.stdin:
    for word in line.split():
        if word[-1] in stopsentence:
            table.setdefault( (w1, w2), [] ).append(word[0:-1])
            w1, w2 = w2, word[0:-1]
            word = word[-1]
        table.setdefault( (w1, w2), [] ).append(word)
        w1, w2 = w2, word
# Mark the end of the file
table.setdefault( (w1, w2), [] ).append(stopword)

# GENERATE SENTENCE OUTPUT
maxsentences = 20

w1 = stopword
w2 = stopword
sentencecount = 0
sentence = []
paragraphsep = "\n"
# Note: randrange(1, 5) picks 1-4; use random.randint(1, 5) to include 5
count = random.randrange(1,5)

while sentencecount < maxsentences:
    newword = random.choice(table[(w1, w2)]) # random word from word table
    if newword == stopword: sys.exit()
    if newword in stopsentence:
        print ("%s%s" % (" ".join(sentence), newword), end=" ")
        sentence = []
        sentencecount += 1 # increment the sentence counter
        count -= 1
        if count == 0:
            count = random.randrange(1,5)
            print (paragraphsep) # ends the line, then prints a blank line
    else:
        sentence.append(newword)
    w1, w2 = w2, newword

# EOF
</code></pre>
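One way to avoid both the interference between the sentence loop and the paragraph loop and the trailing space at the end of each paragraph is to keep the two concerns apart: generate whole sentences first, then group them. The following is only a minimal sketch of that idea, not the recipe's or the answerer's code. The function names `build_table`, `generate_sentences` and `group_into_paragraphs`, and the `rng`, `low` and `high` parameters, are hypothetical names introduced here for illustration. It uses `random.randint`, whose upper bound is inclusive, so a paragraph can hold 1 to 5 sentences.

```python
import random

STOP_WORD = "\n"  # sentinel; never a real word because input is split on whitespace
SENTENCE_ENDINGS = (".", "!", "?")


def build_table(text):
    """Map each pair of consecutive tokens to the tokens seen right after it."""
    table = {}
    w1 = w2 = STOP_WORD
    for word in text.split():
        if word[-1] in SENTENCE_ENDINGS:
            # Store trailing punctuation as its own token, as the script above does.
            table.setdefault((w1, w2), []).append(word[:-1])
            w1, w2 = w2, word[:-1]
            word = word[-1]
        table.setdefault((w1, w2), []).append(word)
        w1, w2 = w2, word
    table.setdefault((w1, w2), []).append(STOP_WORD)  # mark the end of the input
    return table


def generate_sentences(table, max_sentences, rng=random):
    """Yield up to max_sentences sentences; stop early at the end-of-input marker."""
    w1 = w2 = STOP_WORD
    words = []
    produced = 0
    while produced < max_sentences:
        new_word = rng.choice(table[(w1, w2)])
        if new_word == STOP_WORD:
            return
        if new_word in SENTENCE_ENDINGS:
            yield " ".join(words) + new_word
            words = []  # the next sentence starts from an empty list
            produced += 1
        else:
            words.append(new_word)
        w1, w2 = w2, new_word


def group_into_paragraphs(sentences, low=1, high=5, rng=random):
    """Join runs of low..high sentences (both inclusive) into paragraph strings."""
    paragraphs = []
    current = []
    target = rng.randint(low, high)
    for sentence in sentences:
        current.append(sentence)
        if len(current) == target:
            # Joining a list puts spaces only between sentences, never at the end.
            paragraphs.append(" ".join(current))
            current = []
            target = rng.randint(low, high)
    if current:  # keep a final paragraph that is shorter than its target
        paragraphs.append(" ".join(current))
    return paragraphs
```

To use it the same way as the script above, one could read the training text with `sys.stdin.read()`, pass it to `build_table`, and print `"\n\n".join(group_into_paragraphs(generate_sentences(table, 20)))`. Because the paragraph grouping only ever sees finished sentences, it cannot disturb the word-level state (`w1`, `w2`, `words`) of the generator.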
