<p>You can also approach it with <code>nltk</code> and its <a href="http://www.nltk.org/book/ch01.html#searching-text" rel="nofollow noreferrer">"concordance" method</a>, inspired by <a href="https://stackoverflow.com/questions/8898131/calling-nltks-concordance-how-to-get-text-before-after-a-word-that-was-used">Calling NLTK's concordance - how to get text before/after a word that was used?</a>:</p>
<blockquote>
<p>A concordance view shows us every occurrence of a <em>given word, together
with some context</em>.</p>
</blockquote>
<pre><code>import nltk

# word_tokenize needs NLTK's Punkt tokenizer data, installed via nltk.download()


def get_neighbors(input_text, word, before, after):
    text = nltk.Text(nltk.tokenize.word_tokenize(input_text))
    concordance_index = nltk.ConcordanceIndex(text.tokens)
    # Position of the first occurrence; raises StopIteration if the word is absent.
    offset = next(iter(concordance_index.offsets(word)))
    # `before` tokens, the word itself, then `after` tokens; clamp at the start of the text.
    start = max(offset - before, 0)
    return text.tokens[start: offset + after + 1]


text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."
print(get_neighbors(text, 'laboris', 5, 2))
</code></pre>
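<p>This prints the 5 words/tokens before the target word, the target word itself, and the 2 words/tokens after it. Punctuation marks count as tokens.</p>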
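<p>Since a concordance covers every occurrence of a word, the same windowing idea can be extended beyond the first match. The following is only a minimal sketch under stated assumptions: <code>tokenize</code> and <code>get_all_neighbors</code> are hypothetical helper names, the regex tokenizer is a rough stand-in for <code>nltk.tokenize.word_tokenize</code> (so the snippet runs without NLTK data), and the sample is a trimmed excerpt of the text above. Matching is case-sensitive.</p>
<pre><code>import re


def tokenize(input_text):
    # Assumption: a rough stand-in for nltk.tokenize.word_tokenize.
    # Yields runs of word characters, or single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", input_text)


def get_all_neighbors(input_text, word, before, after):
    # Hypothetical helper: one context window per occurrence of `word`.
    tokens = tokenize(input_text)
    windows = []
    for offset, token in enumerate(tokens):
        if token == word:
            # Clamp so a match near the start does not wrap around.
            start = max(offset - before, 0)
            windows.append(tokens[start: offset + after + 1])
    return windows


sample = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis aute irure dolor in reprehenderit."
for window in get_all_neighbors(sample, "dolor", 3, 1):
    print(window)
# ['Lorem', 'ipsum', 'dolor', 'sit']
# ['Duis', 'aute', 'irure', 'dolor', 'in']
</code></pre>
<p>The first window is shorter because only two tokens precede the first match. A word that never occurs yields an empty list rather than an exception.</p>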