<p>I believe <a href="http://www.copyscape.com/" rel="nofollow noreferrer"><strong>Copyscape</strong></a> uses <strong>4-grams</strong> to help determine uniqueness.</p>
<p>These strings are called <a href="http://en.wikipedia.org/wiki/N-gram" rel="nofollow noreferrer"><strong>N-Grams</strong></a>.</p>
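<p>For illustration, here is a minimal sketch of extracting word-level n-grams from a piece of text. The class name <code>WordNGrams</code> and the method <code>ngrams</code> are hypothetical names chosen for this example, and splitting on whitespace is an assumption. It is not a description of how Copyscape tokenizes text.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only.
final class WordNGrams {

    // Returns every run of n consecutive whitespace-separated words in text.
    static List<String> ngrams(String text, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<String> result = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return result;
        }
        String[] words = trimmed.split("\\s+");
        // Slide a window of n words across the text, one word at a time.
        for (int i = 0; i + n <= words.length; i++) {
            result.add(String.join(" ", Arrays.copyOfRange(words, i, i + n)));
        }
        return result;
    }
}
```

<p>Calling <code>WordNGrams.ngrams("the quick brown fox jumps", 4)</code> yields the two 4-grams "the quick brown fox" and "quick brown fox jumps". Two documents could then be compared by how many such 4-grams they share.</p>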
<p>However, <a href="https://stackoverflow.com/a/653165/1333791"><strong>another SO answer</strong></a> links to a <a href="http://www.catalysoft.com/articles/StrikeAMatch.html" rel="nofollow noreferrer"><strong>language independent algo comparing bi-grams</strong></a> on a character basis. It is already implemented in Java, which should help save time.</p>
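<p>As a rough sketch of the character bigram idea, and not the linked article's own code, the following computes a similarity score as twice the number of shared adjacent-letter pairs divided by the total number of pairs in both strings. The class name <code>BigramSimilarity</code> and its method names are made up for this example. Upper-casing the input and splitting on whitespace are assumptions about normalization.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only.
final class BigramSimilarity {

    // Collects adjacent character pairs from each whitespace-separated word.
    static List<String> letterPairs(String text) {
        List<String> pairs = new ArrayList<>();
        for (String word : text.trim().toUpperCase(Locale.ROOT).split("\\s+")) {
            for (int i = 0; i + 1 < word.length(); i++) {
                pairs.add(word.substring(i, i + 2));
            }
        }
        return pairs;
    }

    // Returns a score in [0, 1]: 2 * shared pairs / (pairs in a + pairs in b).
    static double similarity(String a, String b) {
        List<String> pairsA = letterPairs(a);
        List<String> pairsB = letterPairs(b);
        int total = pairsA.size() + pairsB.size();
        if (total == 0) {
            return 0.0; // neither input has any pairs to compare
        }
        int shared = 0;
        for (String pair : pairsA) {
            // Remove each matched pair so repeated pairs are not counted twice.
            if (pairsB.remove(pair)) {
                shared++;
            }
        }
        return 2.0 * shared / total;
    }
}
```

<p>For example, "France" and "French" each produce five pairs and share two of them (FR and NC), so <code>BigramSimilarity.similarity("France", "French")</code> gives 2 * 2 / 10 = 0.4.</p>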