查找输入字符串和固定字符串集之间的匹配项

inputString = "love".lower() fixedString = "love".lower() lenComp = min(len(inputString), len(fixedString)) counter = 0 for i in range(lenComp): if inputString[i] == fixedString[i]: # Partial match counter += 1 if counter == len(inputString): # Full match counter += 10 print(counter) # 14

1条回答

网友
1楼 · 发布于 2024-05-23 18:29:54

这只是一个探索的想法，也许一个修改过的前缀树可以完成这项工作，您需要添加每个字符串及其所有子字符串，并跟踪每个字符在树中其他位置的位置
例如：ABD、BAD、RCDB可以添加到前缀树中，如下所示
假设要查找CAB的匹配项，则需要一个获取字符串和起始位置的搜索函数
search(CAB,0) this will try to find a first node C with position 0 but won't find one
然后移除C并搜索位置为1的AB
search(AB,1)
这将找到A节点，该节点表示A存在于坏节点的位置1，您将遵循该链接。等等
编辑：
您可以通过从根节点开始并将字母表中的所有字母添加到该节点来创建树。每个字母都有一个位置字典
node = { ... pos : { } }
最初，所有字母的位置字典都是空的，也就是说，它们在任何单词中都不会用作起始字符或中间字符
加上ABD这个词，逻辑是这样的
Letter 0 - A update pos dict for A to be {0 : []} the contents of the array for pos 0 are not used (i think) Letter 1 - B add node B under A update pos for node B under root to {1 : ['A']} Letter 2 - D add node D under B update pos for node D under root to {2 : ['AB']} note: it is set to AB so it helps you walk down the tree quicker when doing a search
接下来我们加上坏的
Letter 0 - B update pos for node B to be {0:[], 1:['A']} Letter 1 - A add node A under B update pos for Node A under root to be {0:[],1:['B']} letter 2 - D add node D under B of the chain BA update pos for node D under root to be {2 : ['AB','BA']}
要搜索CAD这个词，您需要
检查顶层根目录下的节点C，该节点有空pos dict{} 所以没有以C开头的单词，所以我们可以跳过C
移动到pos=1的root下的下一个char check节点A，它有pos={0:[]，1:['B']}，因此获取键1的数组，这告诉您将在root B节点下找到A，从这里遍历树以找到最长的匹配，这将为您提供A（BD）
移动到下一个char D，检查pos 2的根节点D，它有{2:['AB'，'BA']}，因此您遍历树以查找匹配的AB（D）和BA（D）
编辑2：
这是一个非常简单（且效率低下）的python实现：
import string, pprint pp = pprint.PrettyPrinter(indent=4) root = {x:{} for x in string.ascii_lowercase} def add_word(word): def _add_word(node, hist, word, pos): if word[0] not in node: node[word[0]] = {} root[word[0]].setdefault("pos",{}).setdefault(pos,set()).add(hist+word[0]) if len(word) == 1: node[word[0]]['is_word'] = True else: _add_word(node[word[0]], hist+word[0], word[1:], pos+1) _add_word(root, "", word.lower(), 0) def search(word): found_words = set() def find(node,prefix,word,pos,result): # print("prefix:",prefix,", word:",word,", pos:", pos,", result:",result) if prefix: find(node[prefix[0]], prefix[1:], word, pos, result+prefix[0]) else: if node.get('is_word',False): found_words.add(result) for x in string.ascii_lowercase: if x in node: find(node[x], "", "", pos, result+x) for i in range(len(word)): if i in root[word[i]].get("pos",{}): for prefix in root[word[i]]["pos"][i]: # print("calling find with: ", prefix, word, i) find(root,prefix,word[i:],i,"") print(f"result for {word}:", found_words) add_word('apple') add_word('bullcrap') add_word('tomato') add_word('apricot') # pp.pprint(root) search('applecrab') search('dove')
运行上述程序的输出为：
result for applecrab: {'apple', 'bullcrap', 'apricot'} result for dove: {'tomato'}

下面是如何匹配的示例：

相关问题更多 >

编程相关推荐

热门问题

热门文章