Suppose I have a list:
person_name = ['zakesh', 'oldman LLC', 'bikash', 'goldman LLC', 'zikash','rakesh']
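I am trying to group the list so that strings with the highest Levenshtein similarity to each other end up together. To compute the ratio between two words, I am using the Python package fuzzywuzzy.
Example:
^{pr2}$
My end goal: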
My end goal is to group the words so that the Levenshtein similarity ratio between the words in a group is more than 80 percent.
My list should look like this:
person_name = ['bikash', 'zikash', 'rakesh', 'zakesh', 'goldman LLC', 'oldman LLC']
because the similarity between `bikash` and `zikash` is very high, so they should be together.
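Code:
I tried to achieve this by sorting, with fuzz.ratio as the key function. The code below does not work, but this is the angle from which I am approaching the problem.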
from fuzzywuzzy import fuzz
combined_list = ['rakesh', 'zakesh', 'bikash', 'zikash', 'goldman LLC', 'oldman LLC']
combined_list.sort(key=lambda x, y: fuzz.ratio(x, y))
print combined_list
Could anyone help me to combine the words so that Levenshtein distance between them is more than 80 percent?
This will group the names,
producing
^{2}$
As you can see, the names are grouped correctly, but the order may not be the one you want.
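For illustration, here is a minimal sketch of one way such grouping could be done. It is not necessarily the approach used above. Note that `list.sort` calls its `key` function with a single element, so a pairwise score such as `fuzz.ratio(x, y)` cannot be used as a sort key directly; the sketch instead places each name into the first group that already contains a sufficiently similar member. To keep it self-contained, it uses the standard library's `difflib.SequenceMatcher` in place of `fuzz.ratio` (an assumption: the scores should be close to what fuzzywuzzy reports, but may not be identical). The names `similarity`, `group_similar`, and `threshold` are hypothetical helpers introduced only for this example.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0-100 similarity score (stand-in for fuzz.ratio)."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def group_similar(names, threshold=80):
    """Greedily place each name into the first group containing
    a member whose similarity is at least `threshold`."""
    groups = []
    for name in names:
        for group in groups:
            if any(similarity(name, member) >= threshold for member in group):
                group.append(name)
                break
        else:
            # No existing group is similar enough, so start a new one.
            groups.append([name])
    return groups


person_name = ['zakesh', 'oldman LLC', 'bikash', 'goldman LLC', 'zikash', 'rakesh']
groups = group_similar(person_name)

# Flatten the groups back into a single list with similar names adjacent.
flat = [name for group in groups for name in group]
print(groups)
# [['zakesh', 'rakesh'], ['oldman LLC', 'goldman LLC'], ['bikash', 'zikash']]
```

With a threshold of 80, `bikash`/`zikash` and `zakesh`/`rakesh` land in separate groups, because pairs such as `bikash` and `zakesh` score well below 80. The result also depends on input order, since the grouping is greedy; a lower threshold or a proper clustering method would be needed to merge all four names.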