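symspellpy - PyPI

Project description

symspellpy

symspellpy is a Python port of SymSpell v6.3, a release of the original library that provides higher speed and lower memory consumption. Unit tests from the original project are implemented to ensure the accuracy of the port.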

Please note that the port itself has not been optimized for speed.

Usage

Installing the `symspellpy` module

pip install -U symspellpy

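Copying the frequency dictionary to your project

Copy frequency_dictionary_en_82_765.txt (found in the inner symspellpy directory) to your project directory so that you end up with the following layout: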

project_dir
  +-frequency_dictionary_en_82_765.txt
  \-project.py

Adding new terms

Use load_dictionary(corpus=<path/to/dictionary.txt>, term_index=<term_index>, count_index=<count_index>). dictionary.txt should contain:

<term> <count>
<term> <count>
...
<term> <count>

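where term_index is the column number of the term and count_index is the column number of the count/frequency.

Alternatively, append <term> <count> to the provided frequency_dictionary_en_82_765.txt,
or use the method create_dictionary_entry(key=<term>, count=<count>).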
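To make the file layout and the two column indexes concrete, here is a minimal sketch that writes a tiny dictionary file and reads it back. The helper `parse_frequency_dictionary` is hypothetical and is not part of symspellpy; it only illustrates how `term_index` and `count_index` select columns. Summing the counts of repeated terms is an assumption of this sketch.

```python
import os
import tempfile


def parse_frequency_dictionary(path, term_index=0, count_index=1,
                               encoding="utf-8"):
    """Read '<term> <count>' lines into a dict (hypothetical helper)."""
    words = {}
    with open(path, encoding=encoding) as infile:
        for line in infile:
            parts = line.split()
            # skip blank lines and lines with too few columns
            if len(parts) <= max(term_index, count_index):
                continue
            try:
                count = int(parts[count_index])
            except ValueError:
                continue  # skip lines whose count is not an integer
            term = parts[term_index]
            # assumption of this sketch: repeated terms have their counts summed
            words[term] = words.get(term, 0) + count
    return words


# write a tiny dictionary in the expected layout, then read it back
with tempfile.TemporaryDirectory() as tmp_dir:
    dictionary_path = os.path.join(tmp_dir, "dictionary.txt")
    with open(dictionary_path, "w", encoding="utf-8") as outfile:
        outfile.write("the 23135851162\nsymspell 42\n\nsymspell 8\n")
    words = parse_frequency_dictionary(dictionary_path,
                                       term_index=0, count_index=1)

print(words)
```

With a file whose columns are swapped (count first, term second), the same sketch would be called with term_index=1 and count_index=0.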

Sample usage (`create_dictionary`)

from symspellpy.symspellpy import SymSpell  # import the module


def main():
    # maximum edit distance per dictionary precalculation
    max_edit_distance_dictionary = 2
    prefix_length = 7
    # create object
    sym_spell = SymSpell(max_edit_distance_dictionary, prefix_length)

    # create dictionary using corpus.txt
    corpus_path = "<path/to/corpus.txt>"  # replace with the path to your corpus
    if not sym_spell.create_dictionary(corpus_path):
        print("Corpus file not found")
        return

    # display each term and its count
    for key, count in sym_spell.words.items():
        print("{} {}".format(key, count))


if __name__ == "__main__":
    main()

corpus.txt should contain:

abc abc-def abc_def abc'def abc qwe qwe1 1qwe q1we 1234 1234

Expected output:

abc 4
def 2
abc'def 1
qwe 1
qwe1 1
1qwe 1
q1we 1
1234 2
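The counts above suggest that hyphens and underscores act as word separators while apostrophes and digits stay inside a word. The following sketch reproduces that counting with the standard library only. The regular expression is an assumption chosen to match the expected output, and `count_words` is a hypothetical helper, not the symspellpy API.

```python
import re
from collections import Counter

# Assumed tokenizer: runs of letters/digits, with apostrophes kept inside
# words. Underscores and hyphens therefore act as separators.
WORD_PATTERN = re.compile(r"(?:[^\W_]|')+")


def count_words(text):
    """Return a Counter of the lowercased tokens found in text."""
    return Counter(WORD_PATTERN.findall(text.lower()))


corpus = "abc abc-def abc_def abc'def abc qwe qwe1 1qwe q1we 1234 1234"
counts = count_words(corpus)

# display each term and its count, in order of first appearance
for key, count in counts.items():
    print("{} {}".format(key, count))
```

Because "abc-def" and "abc_def" each contribute one "abc" and one "def", "abc" ends up with a count of 4 and "def" with a count of 2.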

Sample usage (`lookup` and `lookup_compound`)

Using project.py (the code is more verbose than necessary so that the method arguments can be explained)

import os

from symspellpy.symspellpy import SymSpell, Verbosity  # import the module


def main():
    # maximum edit distance per dictionary precalculation
    max_edit_distance_dictionary = 2
    prefix_length = 7
    # create object
    sym_spell = SymSpell(max_edit_distance_dictionary, prefix_length)
    # load dictionary
    dictionary_path = os.path.join(os.path.dirname(__file__),
                                   "frequency_dictionary_en_82_765.txt")
    term_index = 0  # column of the term in the dictionary text file
    count_index = 1  # column of the term frequency in the dictionary text file
    if not sym_spell.load_dictionary(dictionary_path, term_index, count_index):
        print("Dictionary file not found")
        return

    # lookup suggestions for single-word input strings
    input_term = "memebers"  # misspelling of "members"
    # max edit distance per lookup
    # (max_edit_distance_lookup <= max_edit_distance_dictionary)
    max_edit_distance_lookup = 2
    suggestion_verbosity = Verbosity.CLOSEST  # TOP, CLOSEST, ALL
    suggestions = sym_spell.lookup(input_term, suggestion_verbosity,
                                   max_edit_distance_lookup)
    # display suggestion term, edit distance, and term frequency
    for suggestion in suggestions:
        print("{}, {}, {}".format(suggestion.term, suggestion.distance,
                                  suggestion.count))

    # lookup suggestions for multi-word input strings (supports compound
    # splitting & merging)
    input_term = ("whereis th elove hehad dated forImuch of thepast who "
                  "couqdn'tread in sixtgrade and ins pired him")
    # max edit distance per lookup (per single word, not per whole input string)
    max_edit_distance_lookup = 2
    suggestions = sym_spell.lookup_compound(input_term,
                                            max_edit_distance_lookup)
    # display suggestion term, edit distance, and term frequency
    for suggestion in suggestions:
        print("{}, {}, {}".format(suggestion.term, suggestion.distance,
                                  suggestion.count))


if __name__ == "__main__":
    main()

Expected output:

members, 1, 226656153

where is the love he had dated for much of the past who couldn't read in six grade and inspired him, 9, 300000
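The second field in each output line is an edit distance. As a rough illustration of why "memebers" is one edit away from "members", here is a minimal sketch of the optimal string alignment variant of the Damerau-Levenshtein distance. The function `osa_distance` is written for this example only; symspellpy ships its own distance comparer, which may differ in detail.

```python
def osa_distance(source, target):
    """Count the insertions, deletions, substitutions and adjacent
    transpositions needed to turn source into target."""
    rows, cols = len(source) + 1, len(target) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # delete every character of the source prefix
    for j in range(cols):
        dist[0][j] = j  # insert every character of the target prefix
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution or match
            )
            # adjacent transposition, e.g. "teh" -> "the"
            if (i > 1 and j > 1
                    and source[i - 1] == target[j - 2]
                    and source[i - 2] == target[j - 1]):
                dist[i][j] = min(dist[i][j], dist[i - 2][j - 2] + 1)
    return dist[-1][-1]


print(osa_distance("memebers", "members"))    # one deleted "e"
print(osa_distance("couqdn't", "couldn't"))   # one substituted letter
print(osa_distance("teh", "the"))             # one transposition
```

All three calls print 1, so each of these misspellings falls within a max_edit_distance_lookup of 2.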

Sample usage (`word_segmentation`)

Using project.py (the code is more verbose than necessary so that the method arguments can be explained)

import os

from symspellpy.symspellpy import SymSpell  # import the module


def main():
    # maximum edit distance per dictionary precalculation
    max_edit_distance_dictionary = 0
    prefix_length = 7
    # create object
    sym_spell = SymSpell(max_edit_distance_dictionary, prefix_length)
    # load dictionary
    dictionary_path = os.path.join(os.path.dirname(__file__),
                                   "frequency_dictionary_en_82_765.txt")
    term_index = 0  # column of the term in the dictionary text file
    count_index = 1  # column of the term frequency in the dictionary text file
    if not sym_spell.load_dictionary(dictionary_path, term_index, count_index):
        print("Dictionary file not found")
        return

    # a sentence without any spaces
    input_term = "thequickbrownfoxjumpsoverthelazydog"

    result = sym_spell.word_segmentation(input_term)
    # display corrected string, sum of edit distances, and sum of log probabilities
    print("{}, {}, {}".format(result.corrected_string, result.distance_sum,
                              result.log_prob_sum))


if __name__ == "__main__":
    main()

Expected output:

the quick brown fox jumps over the lazy dog, 8, -34.491167981910635
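To give a feel for what segmentation involves, the sketch below splits a string into dictionary words with a small dynamic program that maximizes the summed log probability of the words. It is not taken from symspellpy and does no spelling correction; the word counts are made-up numbers and `segment` is a hypothetical helper.

```python
import math

# Toy unigram counts (made-up numbers, for illustration only).
COUNTS = {"the": 500, "quick": 20, "brown": 15, "fox": 10,
          "jumps": 5, "over": 60, "lazy": 8, "dog": 25}
TOTAL = sum(COUNTS.values())
MAX_WORD_LENGTH = max(len(word) for word in COUNTS)


def segment(text):
    """Split text into known words, maximizing the summed log10 probability.

    Returns (log_prob_sum, words); words is empty if no split exists.
    """
    # best[i] holds (log_prob_sum, words) for the best split of text[:i]
    best = [(0.0, [])] + [(-math.inf, [])] * len(text)
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - MAX_WORD_LENGTH), end):
            word = text[start:end]
            # skip unknown words and prefixes that cannot be split at all
            if word not in COUNTS or best[start][0] == -math.inf:
                continue
            score = best[start][0] + math.log10(COUNTS[word] / TOTAL)
            if score > best[end][0]:
                best[end] = (score, best[start][1] + [word])
    return best[-1]


log_prob_sum, words = segment("thequickbrownfoxjumpsoverthelazydog")
print(" ".join(words))
print(len(words) - 1)  # number of spaces inserted
```

In this sketch the nine words require eight inserted spaces, the same number that appears as the second field of the expected output above. The log probability printed by symspellpy depends on the real frequency dictionary, so the toy score is not comparable.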

Casing transfer

To transfer the casing of the original phrase to the corrected text, use the boolean flag transfer_casing of the lookup() and lookup_compound() methods:

lookup_compound():

suggestions = sym_spell.lookup_compound(input_term,
                                        max_edit_distance_lookup,
                                        transfer_casing=True)

lookup():

suggestions = sym_spell.lookup(input_term,
                               suggestion_verbosity,
                               max_edit_distance_lookup,
                               transfer_casing=True)
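As an illustration of what transferring casing means, the sketch below copies the upper/lower case pattern of an original phrase onto a corrected phrase of the same length. `transfer_casing_matching` is a hypothetical helper written for this example; how symspellpy handles corrections that change the length of the text is not covered here.

```python
def transfer_casing_matching(original, corrected):
    """Copy upper/lower case from original onto corrected, position by
    position (hypothetical helper; equal-length texts only)."""
    if len(original) != len(corrected):
        raise ValueError("texts must have the same length")
    return "".join(
        new.upper() if old.isupper() else new.lower()
        for old, new in zip(original, corrected)
    )


# the typos are fixed in lowercase, then the original casing is restored
print(transfer_casing_matching("Teh Qiuck Fox", "the quick fox"))
```

This prints "The Quick Fox": the capital letters stay at the positions where the original phrase had them.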

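Changelog

6.3.9 (2019-08-06)

Added transfer_casing to lookup and lookup_compound
Fixed the prefix length check in _edits_prefix

6.3.8 (2019-03-21)

Implemented delete_dictionary_entry
Improved performance by using Python's built-in hashing
Added versioning of the pickle

6.3.7 (2019-02-18)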

Fixed include_unknown in lookup

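Removed the unused initial_capacity argument
Improved _get_str_hash performance
Implemented save_pickle and load_pickle to avoid having to create the dictionary every time

6.3.6 (2019-02-11)

Added the create_dictionary() feature

6.3.5 (2019-01-14)

Fixed lookup_compound() to return the correct distance

6.3.4 (2019-01-04)

Added <self._replaced_words = dict()> to track the number of misspelled words
Added ignore_token to word_segmentation() to ignore words matching a regular expression

6.3.3 (2018-12-05)

Added the word_segmentation() feature

6.3.2 (2018-10-23)

Added the encoding option to load_dictionary()

6.3.1 (2018-08-30)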

Created a package for symspellpy

6.3.0 (2018-08-13)

Ported SymSpell v6.3

symspellpy 6.3.9
