<p>You don't actually want a <code>skipgram</code> per se; you want fixed-size chunks. Try this:</p>
<pre><code>from lazyme import per_chunk
tokens = "my name is John".split()
list(per_chunk(tokens, 2))
</code></pre>
<p>[out]:</p>
<pre><code>[('my', 'name'), ('is', 'John')]
</code></pre>
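<p>If you would rather not install <code>lazyme</code>, the same chunking can be sketched with the standard library. This is a minimal sketch, not <code>lazyme</code>'s implementation: the helper name <code>chunks</code> is hypothetical, and it assumes a trailing chunk that is shorter than <code>n</code> should be returned as-is, which may differ from how <code>per_chunk</code> treats a leftover.</p>

```python
from itertools import islice


def chunks(tokens, n):
    """Yield consecutive, non-overlapping tuples of up to n items.

    Hypothetical helper for illustration; the last tuple is shorter
    when the input length is not a multiple of n (an assumption here).
    """
    it = iter(tokens)
    while True:
        chunk = tuple(islice(it, n))  # take the next n items, or fewer at the end
        if not chunk:
            return
        yield chunk


tokens = "my name is John".split()
print(list(chunks(tokens, 2)))
```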
<p>If you need a rolling window, i.e. <code>ngrams</code>:</p>
<pre><code>from lazyme import per_window
tokens = "my name is John".split()
list(per_window(tokens, 2))
</code></pre>
<p>[out]:</p>
<pre><code>[('my', 'name'), ('name', 'is'), ('is', 'John')]
</code></pre>
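<p>The rolling window itself is easy to sketch in plain Python by zipping the token list against shifted copies of itself. The helper name <code>windows</code> is hypothetical, and this is only an illustration of the idea, not how <code>per_window</code> is written.</p>

```python
def windows(tokens, n):
    """Return an iterator over every run of n consecutive items.

    Hypothetical helper for illustration only.
    """
    tokens = list(tokens)
    # Zip n views of the list, each one shifted a position further right;
    # zip stops at the shortest view, so no incomplete windows are produced.
    return zip(*(tokens[i:] for i in range(n)))


tokens = "my name is John".split()
print(list(windows(tokens, 2)))
```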
<p>This is similar to NLTK's <code>ngrams</code>:</p>
<pre><code>from nltk import ngrams
tokens = "my name is John".split()
list(ngrams(tokens, 2))
</code></pre>
<p>[out]:</p>
<pre><code>[('my', 'name'), ('name', 'is'), ('is', 'John')]
</code></pre>
<p>If you want actual skipgrams (n-grams that may skip up to <code>k</code> tokens), see <a href="https://stackoverflow.com/questions/31847682/how-to-compute-skipgrams-in-python">How to compute skipgrams in python?</a></p>
<pre><code>from nltk import skipgrams
tokens = "my name is John".split()
list(skipgrams(tokens, n=2, k=1))
</code></pre>
<p>[out]:</p>
<pre><code>[('my', 'name'),
('my', 'is'),
('name', 'is'),
('name', 'John'),
('is', 'John')]
</code></pre>
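<p>To see where those pairs come from, here is a rough sketch of the idea in plain Python: fix each token as the head, then combine it with <code>n - 1</code> tokens chosen in order from the next <code>n - 1 + k</code> positions. The name <code>simple_skipgrams</code> is hypothetical, and this is a simplified illustration rather than NLTK's implementation.</p>

```python
from itertools import combinations


def simple_skipgrams(tokens, n, k):
    """Yield n-token tuples that skip at most k tokens in total.

    Hypothetical helper for illustration; not NLTK's own code.
    """
    tokens = list(tokens)
    for i in range(len(tokens) - n + 1):
        head = tokens[i]
        # The other n - 1 items are drawn, in order, from the next n - 1 + k tokens.
        tail_window = tokens[i + 1 : i + n + k]
        for rest in combinations(tail_window, n - 1):
            yield (head,) + rest


tokens = "my name is John".split()
print(list(simple_skipgrams(tokens, n=2, k=1)))
```

<p>With <code>k=0</code> nothing can be skipped, so the sketch reduces to ordinary n-grams.</p>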