An autocomplete model that is easy to integrate with Flask applications

Detailed description of the markov_autocomplete Python project


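Markov Autocomplete

Hidden Markov model that generates autocomplete suggestions.

How to use

The model can be trained on your own list of sentences.

For example, to train it on the first two paragraphs of Robinson Crusoe: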

from markov_autocomplete.autocomplete import Autocomplete

sentences = ["I WAS born in the year 1632, in the city of York, of a good family, though not of that country, my father being a foreigner of Bremen, who settled first at Hull. He got a good estate by merchandise, and leaving off his trade, lived afterwards at York, from whence he had married my mother, whose relations were named Robinson, a very good family in that country, and from whom I was called Robinson Kreutznaer; but, by the usual corruption of words in England, we are now called - nay we call ourselves and write our name - Crusoe; and so my companions always called me.", "I had two elder brothers, one of whom was lieutenant-colonel to an English regiment of foot in Flanders, formerly commanded by the famous Colonel Lockhart, and was killed at the battle near Dunkirk against the Spaniards. What became of my second brother I never knew, any more than my father or mother knew what became of me."]

ac = Autocomplete(model_path = "ngram", sentences = sentences, n_model=3, n_candidates=10, match_model="middle", min_freq=0, punctuations="", lowercase=True)

ac.predictions("country")

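How it works

Given an input string consisting of n words w_1, \ldots, w_n, the model uses the language model to predict the next word w_{n+1}.

The most likely candidates for w_{n+1} are the words that maximize the conditional probability

P(w_{n+1} \mid w_n, \ldots, w_{n-o+2})

where o is the order of the model, so the next word is conditioned on the previous o - 1 words.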
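The prediction step described above can be sketched with plain frequency counts. This is a minimal illustration written for this description, not the code of markov_autocomplete itself: the helper names `train_ngram_counts` and `predict_next` and the toy corpus are assumptions, and probabilities are estimated as simple relative frequencies with no smoothing.

```python
from collections import Counter, defaultdict

def train_ngram_counts(sentences, order):
    """For every context of (order - 1) words, count which words follow it."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(order - 1, len(words)):
            context = tuple(words[i - order + 1:i])
            followers[context][words[i]] += 1
    return followers

def predict_next(followers, text, order, n_candidates=3):
    """Return (word, estimated P(word | context)) pairs, most probable first."""
    words = text.lower().split()
    # Only the last (order - 1) typed words are used as context.
    context = tuple(words[-(order - 1):]) if order > 1 else ()
    counts = followers.get(context, Counter())
    total = sum(counts.values())
    return [(word, count / total) for word, count in counts.most_common(n_candidates)]

# Hypothetical toy corpus, used only for illustration.
corpus = ["the cat sat on the mat", "the cat sat down", "the cat ate the fish"]
followers = train_ngram_counts(corpus, order=3)
suggestions = predict_next(followers, "I think the cat", order=3)
print([word for word, _ in suggestions])  # prints ['sat', 'ate']
```

With order 3, only the last two typed words ("the cat") are used as context, and an unseen context simply yields no suggestions.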

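Once the best candidates have been computed, the probability of the whole sentence is approximated with the N-gram model

P(w_1, \ldots, w_n, w_{n+1}) = \prod_{i=1}^{n+1} P(w_i \mid w_{i-o+1}, \ldots, w_{i-1})

where context words with an index below 1 are omitted.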

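For example, for a 2-gram model we have

P(w_1, w_2, w_3, w_4) = P(w_1) P(w_2 \mid w_1) P(w_3 \mid w_2) P(w_4 \mid w_3)

On the other hand, for a 3-gram model we have

P(w_1, w_2, w_3, w_4) = P(w_1) P(w_2 \mid w_1) P(w_3 \mid w_1, w_2) P(w_4 \mid w_2, w_3)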
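The two factorizations above can be checked numerically on a small corpus. The following is a sketch under stated assumptions: the function names `ngram_counts` and `sentence_probability` and the three-sentence corpus are hypothetical, and each conditional probability is estimated as count(context + word) / count(context) without smoothing, which may differ from what the package does internally.

```python
from collections import Counter

def ngram_counts(sentences, max_order):
    """Count every n-gram of size 1..max_order in the sentences."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        for size in range(1, max_order + 1):
            for i in range(len(words) - size + 1):
                counts[tuple(words[i:i + size])] += 1
    return counts

def sentence_probability(counts, words, order):
    """Multiply P(w_i | previous order - 1 words) over the whole sentence."""
    total_words = sum(c for gram, c in counts.items() if len(gram) == 1)
    probability = 1.0
    for i, word in enumerate(words):
        # Early words have a shorter context, e.g. P(w_1) and P(w_2 | w_1).
        context = tuple(words[max(0, i - order + 1):i])
        denominator = counts[context] if context else total_words
        if denominator == 0:
            return 0.0  # unseen context: no estimate without smoothing
        probability *= counts[context + (word,)] / denominator
    return probability

# Hypothetical toy corpus, used only for illustration.
corpus = ["the cat sat down", "the cat ate fish", "a cat sat up"]
counts = ngram_counts(corpus, max_order=3)
words = "the cat sat down".split()
p_bigram = sentence_probability(counts, words, order=2)
p_trigram = sentence_probability(counts, words, order=3)
print(round(p_bigram, 4))   # prints 0.0556
print(round(p_trigram, 4))  # prints 0.0417
```

The two orders give different values because the 3-gram model conditions "sat" on "the cat" rather than on "cat" alone.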

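Higher-order models are more accurate, but at the cost of generating a large number of n-grams, which can negatively affect storage space and computation time.

If the input string contains fewer words than the order of the model, the autocomplete computes the most probable n-grams of the model's own order.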
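To give a rough sense of this growth, the sketch below counts the distinct n-grams in a single sentence for increasing orders, next to the theoretical upper bound of (vocabulary size) ** order. The function name `count_distinct_ngrams` and the example sentence are assumptions made for illustration only.

```python
def count_distinct_ngrams(sentences, order):
    """Return how many different n-grams of the given order occur in the sentences."""
    seen = set()
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(len(words) - order + 1):
            seen.add(tuple(words[i:i + order]))
    return len(seen)

# Hypothetical example sentence, used only for illustration.
corpus = ["the dog saw the cat and the cat saw the dog"]
vocabulary_size = count_distinct_ngrams(corpus, 1)

for order in (1, 2, 3):
    observed = count_distinct_ngrams(corpus, order)
    upper_bound = vocabulary_size ** order  # every possible word combination
    print(order, observed, upper_bound)
# prints:
# 1 5 5
# 2 7 25
# 3 9 125
```

Even in this tiny example the number of stored n-grams rises with the order, and the number of possible n-grams rises much faster.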
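One possible reading of this fallback is sketched below: when fewer words are typed than the model order, rank the full n-grams of that order that begin with the typed words by frequency. This interpretation, the function name `most_probable_ngrams`, and the toy corpus are all assumptions; the package may select candidates differently (for example depending on its match_model setting).

```python
from collections import Counter

def most_probable_ngrams(sentences, typed, order, n_candidates=3):
    """Return the most frequent n-grams of the given order that start with the typed words."""
    prefix = tuple(typed.lower().split())
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(len(words) - order + 1):
            gram = tuple(words[i:i + order])
            if gram[:len(prefix)] == prefix:
                counts[gram] += 1
    return [" ".join(gram) for gram, _ in counts.most_common(n_candidates)]

# Hypothetical toy corpus, used only for illustration.
corpus = [
    "the cat sat on the mat",
    "the cat sat on the rug",
    "the dog sat on the rug",
]
# Only one word is typed, but the model order is 3.
completions = most_probable_ngrams(corpus, "the", order=3)
print(completions)  # prints ['the cat sat', 'the dog sat']
```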
