Fuzzy string matching in Python

Detailed description of the fuzzywuzzymit Python project


Build status: https://travis-ci.org/graingert/fuzzywuzzymit.svg?branch=master

fuzzywuzzymit

Fuzzy string matching like a boss. It uses Levenshtein Distance to calculate the differences between sequences in a simple-to-use package.
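To make the metric named above concrete, here is a minimal sketch of the classic dynamic-programming Levenshtein distance. The function name `levenshtein` is an assumption made for this illustration and is not part of the fuzzywuzzymit API; the package itself may compute similarity differently (difflib is listed under its requirements).

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible

    # previous[j] holds the distance between the empty prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

A distance of 0 means the strings are identical; larger values mean more edits are needed.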

Requirements

  • Python 2.4 or higher
  • difflib

For testing

  • pycodestyle
  • hypothesis
  • pytest

Installation

Using PyPI via pip

pip install fuzzywuzzymit

Directly from GitHub via pip

pip install git+git://github.com/graingert/fuzzywuzzymit.git@0.16.0#egg=fuzzywuzzymit

Adding to your requirements.txt file (run pip install -r requirements.txt afterwards)

git+ssh://git@github.com/graingert/fuzzywuzzymit.git@0.16.0#egg=fuzzywuzzymit

Manually via git

git clone git://github.com/graingert/fuzzywuzzymit.git fuzzywuzzymit
cd fuzzywuzzymit
python setup.py install

Usage

>>> from fuzzywuzzymit import fuzz
>>> from fuzzywuzzymit import process

Simple Ratio

>>> fuzz.ratio("this is a test", "this is a test!")
97
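As a rough illustration of where a score like 97 can come from, the sketch below computes a 0 to 100 similarity with the standard library's difflib.SequenceMatcher. The helper name `simple_ratio` is an assumption for this example; it is not claimed to be how fuzzywuzzymit implements fuzz.ratio.

```python
from difflib import SequenceMatcher


def simple_ratio(s1: str, s2: str) -> int:
    """Similarity of two strings as an integer from 0 to 100."""
    matcher = SequenceMatcher(None, s1, s2)
    # ratio() is 2 * M / T, where M is the number of matched characters
    # and T is the combined length of both strings.
    return int(round(100 * matcher.ratio()))


# 14 matched characters out of 14 + 15 in total: 2 * 14 / 29 = 0.9655...
print(simple_ratio("this is a test", "this is a test!"))  # 97
```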

Partial Ratio

>>> fuzz.partial_ratio("this is a test", "this is a test!")
100
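One way to think about a partial ratio is "the best score the shorter string achieves against any same-length window of the longer one". The sketch below implements that idea by brute force. The name `partial_ratio_sketch` is hypothetical, and the library may select candidate windows differently, so scores on other inputs need not agree.

```python
from difflib import SequenceMatcher


def partial_ratio_sketch(s1: str, s2: str) -> int:
    """Best 0-100 similarity of the shorter string against any
    equally long substring of the longer string."""
    shorter, longer = (s1, s2) if len(s1) <= len(s2) else (s2, s1)
    if not shorter:
        return 0  # assumption: an empty string matches nothing

    width = len(shorter)
    best = 0.0
    # Slide a window of len(shorter) characters across the longer string.
    for start in range(len(longer) - width + 1):
        window = longer[start:start + width]
        best = max(best, SequenceMatcher(None, shorter, window).ratio())
    return int(round(100 * best))


# "this is a test" occurs verbatim inside "this is a test!"
print(partial_ratio_sketch("this is a test", "this is a test!"))  # 100
```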

Token Sort Ratio

>>> fuzz.ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
91
>>> fuzz.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
100
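The token sort idea can be sketched as: split each string into words, sort the words, rejoin them, then compare. This removes the effect of word order. The helper `token_sort_ratio_sketch` below is an illustrative assumption that only lowercases and splits on whitespace; the library's own preprocessing may differ.

```python
from difflib import SequenceMatcher


def token_sort_ratio_sketch(s1: str, s2: str) -> int:
    """0-100 similarity that ignores word order."""
    def normalize(text: str) -> str:
        # Lowercase, split on whitespace, and put the tokens in sorted order.
        return " ".join(sorted(text.lower().split()))

    a, b = normalize(s1), normalize(s2)
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


# Both inputs normalize to "a bear fuzzy was wuzzy".
print(token_sort_ratio_sketch("fuzzy wuzzy was a bear",
                              "wuzzy fuzzy was a bear"))  # 100
```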

Token Set Ratio

>>> fuzz.token_sort_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
84
>>> fuzz.token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
100
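A token set comparison treats each string as a set of words, so repeated words stop mattering. The sketch below compares the shared words against each string's "shared plus unique" words and keeps the best score. The names `token_set_ratio_sketch` and `_ratio` are assumptions for illustration, and the tokenization (lowercase, whitespace split) is deliberately simplistic.

```python
from difflib import SequenceMatcher


def _ratio(a: str, b: str) -> int:
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def token_set_ratio_sketch(s1: str, s2: str) -> int:
    """0-100 similarity that ignores word order and duplicate words."""
    tokens1 = set(s1.lower().split())
    tokens2 = set(s2.lower().split())

    common = " ".join(sorted(tokens1 & tokens2))  # words in both strings
    only1 = " ".join(sorted(tokens1 - tokens2))   # words unique to s1
    only2 = " ".join(sorted(tokens2 - tokens1))   # words unique to s2

    combined1 = (common + " " + only1).strip()
    combined2 = (common + " " + only2).strip()

    # Keep the most favourable of the three pairwise comparisons.
    return max(_ratio(common, combined1),
               _ratio(common, combined2),
               _ratio(combined1, combined2))


# The duplicate "fuzzy" disappears once each string becomes a set.
print(token_set_ratio_sketch("fuzzy was a bear",
                             "fuzzy fuzzy was a bear"))  # 100
```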

Process

>>> choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
>>> process.extract("new york jets", choices, limit=2)
[('New York Jets', 100), ('New York Giants', 78)]
>>> process.extractOne("cowboys", choices)
("Dallas Cowboys", 90)
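Conceptually, extraction scores every choice against the query, sorts by score, and returns the top results. The sketch below shows that pattern with a pluggable scorer. The names `extract_sketch`, `extract_one_sketch`, and `simple_ratio` are hypothetical, and because the scorer here is a plain difflib ratio, the scores will not necessarily match the library's output shown above.

```python
from difflib import SequenceMatcher


def simple_ratio(s1: str, s2: str) -> int:
    """Case-insensitive 0-100 similarity (assumed default scorer for this sketch)."""
    return int(round(100 * SequenceMatcher(None, s1.lower(), s2.lower()).ratio()))


def extract_sketch(query, choices, scorer=simple_ratio, limit=5):
    """Return up to `limit` (choice, score) pairs, best match first."""
    scored = [(choice, scorer(query, choice)) for choice in choices]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


def extract_one_sketch(query, choices, scorer=simple_ratio):
    """Return the single best (choice, score) pair, or None if there are no choices."""
    best = extract_sketch(query, choices, scorer=scorer, limit=1)
    return best[0] if best else None


choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
top_two = extract_sketch("new york jets", choices, limit=2)
print(top_two[0])  # ('New York Jets', 100)
print(extract_one_sketch("cowboys", choices)[0])  # Dallas Cowboys
```

Passing a different function as `scorer` changes how candidates are ranked without touching the selection logic.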

You can also pass additional parameters to the extractOne method to make it use a specific scorer. A typical use case is to match file paths:

>>> process.extractOne("System of a down - Hypnotize - Heroin", songs)
('/music/library/good/System of a Down/2005 - Hypnotize/01 - Attack.mp3', 86)
>>> process.extractOne("System of a down - Hypnotize - Heroin", songs, scorer=fuzz.token_sort_ratio)
("/music/library/good/System of a Down/2005 - Hypnotize/10 - She's Like Heroin.mp3", 61)

Known Ports

fuzzywuzzymit is being ported to other languages too! Here are a few ports we know about:
