Comparing two blocks of text in Python

3 answers

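User

Answer 1 · edited 2024-04-20 01:45:46

A crude approach... but you could split both strings into words, compare each word with the word at the same position in the other string, and work out the ratio of matches to failures: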

>>> aa = 'One day a man walked over the hill and saw the sun'
>>> bb = 'One day a man walked over a hill and saw the sun'
>>> matches = [a == b for a, b in zip(aa.split(' '), bb.split(' '))]
>>> matches
[True, True, True, True, True, True, False, True, True, True, True, True]
>>> sum(matches)
11
>>> len(matches)
12

So in this example, 11 of the 12 words match. You can then set a pass/fail threshold on that ratio.
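If you want that as a reusable check, here is a minimal sketch. The function name `word_match_ratio` and the 0.9 threshold are illustrative assumptions, not something from the answer above. It divides by the longer word count, because `zip` stops at the shorter text and would otherwise silently ignore extra trailing words.

```python
def word_match_ratio(text_a, text_b):
    """Fraction of word positions at which the two texts agree."""
    words_a = text_a.split()
    words_b = text_b.split()
    # zip() stops at the shorter list, so divide by the longer length
    # to count unmatched trailing words as mismatches.
    longest = max(len(words_a), len(words_b))
    if longest == 0:
        return 1.0  # two empty texts are treated as identical
    matches = sum(a == b for a, b in zip(words_a, words_b))
    return matches / float(longest)


aa = 'One day a man walked over the hill and saw the sun'
bb = 'One day a man walked over a hill and saw the sun'

THRESHOLD = 0.9  # hypothetical pass/fail level; tune it for your data
ratio = word_match_ratio(aa, bb)
passed = ratio >= THRESHOLD
print(ratio, passed)
```

Keep in mind that a purely positional comparison like this is thrown off by a single inserted or deleted word, since every later word shifts by one position.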

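User

Answer 2 · edited 2024-04-20 01:45:46

Looking at your question, difflib.SequenceMatcher.ratio() may come in handy.

This neat routine takes two strings and computes a similarity index in the range [0, 1].

Quick demo (the output shown is from Python 2, which prints floats to 12 significant digits):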

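>>> import difflib
>>> import itertools
>>> st = ['One day a man walked over the hill and saw the sun',
          'One week a woman looked over a hill and saw the sun',
          'One day a man walked over a hill and saw the sun']
>>> for a, b in itertools.product(st, st):
    print("Text 1 {}".format(a))
    print("Text 2 {}".format(b))
    print("Similarity Index {}".format(difflib.SequenceMatcher(None, a, b).ratio()))
    print('-' * 80)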


Text 1 One day a man walked over the hill and saw the sun
Text 2 One day a man walked over the hill and saw the sun
Similarity Index 1.0
--------------------------------------------------------------------------------
Text 1 One day a man walked over the hill and saw the sun
Text 2 One week a woman looked over a hill and saw the sun
Similarity Index 0.831683168317
--------------------------------------------------------------------------------
Text 1 One day a man walked over the hill and saw the sun
Text 2 One day a man walked over a hill and saw the sun
Similarity Index 0.959183673469
--------------------------------------------------------------------------------
Text 1 One week a woman looked over a hill and saw the sun
Text 2 One day a man walked over the hill and saw the sun
Similarity Index 0.831683168317
--------------------------------------------------------------------------------
Text 1 One week a woman looked over a hill and saw the sun
Text 2 One week a woman looked over a hill and saw the sun
Similarity Index 1.0
--------------------------------------------------------------------------------
Text 1 One week a woman looked over a hill and saw the sun
Text 2 One day a man walked over a hill and saw the sun
Similarity Index 0.868686868687
--------------------------------------------------------------------------------
Text 1 One day a man walked over a hill and saw the sun
Text 2 One day a man walked over the hill and saw the sun
Similarity Index 0.959183673469
--------------------------------------------------------------------------------
Text 1 One day a man walked over a hill and saw the sun
Text 2 One week a woman looked over a hill and saw the sun
Similarity Index 0.868686868687
--------------------------------------------------------------------------------
Text 1 One day a man walked over a hill and saw the sun
Text 2 One day a man walked over a hill and saw the sun
Similarity Index 1.0
--------------------------------------------------------------------------------
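SequenceMatcher accepts any sequence of hashable items, not just strings, so it can also compare lists of words instead of characters. The following is only a sketch, assuming plain whitespace tokenisation is good enough for your text; the helper name `word_similarity` and the third sample sentence `cc` are made up for illustration.

```python
import difflib


def word_similarity(text_a, text_b):
    """Similarity in [0, 1] computed over words instead of characters."""
    words_a = text_a.split()
    words_b = text_b.split()
    # autojunk=False disables the "popular element" heuristic that
    # SequenceMatcher applies to sequences of 200 items or more.
    matcher = difflib.SequenceMatcher(None, words_a, words_b, autojunk=False)
    return matcher.ratio()


aa = 'One day a man walked over the hill and saw the sun'
bb = 'One day a man walked over a hill and saw the sun'
cc = 'One fine day a man walked over the hill and saw the sun'

print(word_similarity(aa, bb))  # one word replaced
print(word_similarity(aa, cc))  # one word inserted
```

Unlike a position-by-position comparison, this still scores the inserted-word case highly, because SequenceMatcher realigns the sequences after the extra word.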

User

Answer 3 · edited 2024-04-20 01:45:46

Several Python libraries can help you with this. Take a look at this question:

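Levenshtein distance is a commonly used algorithm. I have also found the NYSIIS algorithm very useful, especially if you want to store a string representation in a database.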
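For reference, Levenshtein distance can be sketched in pure Python with the standard two-row dynamic programming table. This is an illustrative implementation, not taken from any particular library; the `similarity` helper that normalises the distance to [0, 1] is my own assumption about how you might use it for a threshold.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the rows short
    previous = list(range(len(b) + 1))  # distances from '' to b[:j]
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ''
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Hypothetical normalisation: 1.0 means identical, 0.0 means nothing in common."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / float(longest)


print(levenshtein('kitten', 'sitting'))  # 3
print(similarity('One day a man walked over the hill',
                 'One day a man walked over a hill'))
```

This runs in time proportional to the product of the two lengths, so for long blocks of text a compiled library implementation is likely a better choice.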

This link gives an excellent overview:
