Matching a string/sequence by skipping characters

Posted 2024-05-16 23:58:25


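I want to match a string against several substrings. A substring may match directly, or it may match once some characters of the input are skipped.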

For example:

Input: AABCCAADABDC
Substring: AABABDC

AABABDC is a valid sequence:

  • BDC is a direct match
  • AABA matches when the C characters are skipped

How can I match a substring by skipping characters? Thanks for your help.


1 answer
User
#1 · Posted 2024-05-16 23:58:25
def match(input_seq, substring, threshold=2):
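    '''
    Return True if `substring` occurs in `input_seq`, either directly
    or with some characters of `input_seq` skipped.

    `threshold` is the maximum number of input characters that may be
    skipped within one match attempt; exceeding it restarts the search.
    '''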
    is_direct_match = substring in input_seq
    if is_direct_match:
        return True
    
    substring_idx = 0
    input_seq_idx = 0
    input_seq_revisit_idx = 0

    is_matching = False
    num_char_miss = 0

    while input_seq_idx < len(input_seq):
        substring_char = substring[substring_idx]
        input_seq_char = input_seq[input_seq_idx]
        input_seq_idx = input_seq_idx + 1
        
        if substring_char == input_seq_char:
            if not is_matching: # first character matched
                is_matching = True
                input_seq_revisit_idx = input_seq_idx
            substring_idx = substring_idx + 1
        elif is_matching:
            num_char_miss = num_char_miss + 1
        
        if num_char_miss > threshold: # reset and start a new search
            num_char_miss = 0
            substring_idx = 0
            input_seq_idx = input_seq_revisit_idx
            is_matching = False
        if substring_idx == len(substring):
            break
        # print(input_seq_char, substring_char, input_seq_char == substring_char, is_matching, num_char_miss)

    is_skip_match = substring_idx == len(substring)
    return is_skip_match

if __name__ == "__main__":
    input_seq = "AABCCCAADAAABCCABDC"
    substrings = ["AABA", "BDC", "AEABA"]

    for substring in substrings:
        is_valid_seq = match(input_seq=input_seq, substring=substring, threshold=2)
        result = "is a valid"
        if not is_valid_seq:
            result = "is not a valid"

        print("{} {} sequence in {}".format(substring, result, input_seq))

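The code above first uses `in` to check for a direct match. For a skip match, it walks through the input sequence and compares each character with the next expected character of the substring. Once the first character has matched, every non-matching input character counts as a skip; if the number of skips exceeds `threshold`, the search resets and starts again just after the point where the current attempt began. If the input runs out before every character of the substring has been matched, the sequence is not valid.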
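The same idea can also be written as one greedy scan per starting position, which may be easier to follow. This is only a minimal sketch, not a replacement for the function above: the names `skip_match`, `text`, `pattern` and `max_skips` are made up for illustration, and it assumes that `max_skips` limits the total number of skipped characters inside one match, as `threshold` does above.

```python
def skip_match(text, pattern, max_skips=2):
    """Return True if `pattern` occurs in `text` with at most
    `max_skips` characters of `text` skipped inside the match.
    (Hypothetical helper; the names are illustrative only.)"""
    if not pattern:
        return True
    for start in range(len(text)):
        # An attempt can only begin where the first character matches.
        if text[start] != pattern[0]:
            continue
        matched = 1  # pattern characters matched so far
        skips = 0    # text characters skipped in this attempt
        pos = start + 1
        while matched < len(pattern) and pos < len(text) and skips <= max_skips:
            if text[pos] == pattern[matched]:
                matched += 1
            else:
                skips += 1
            pos += 1
        if matched == len(pattern):
            return True
    return False


if __name__ == "__main__":
    text = "AABCCAADABDC"  # input from the question
    for pattern in ["AABA", "BDC", "AEABA"]:
        print(pattern, skip_match(text, pattern, max_skips=2))
```

Taking the earliest possible match for each character keeps the matched span as short as it can be for a given start, so the skip count for that start is also as small as possible. Note that with the question's input, the full substring AABABDC needs five skipped characters, so it only matches with `max_skips=5` or higher.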

Let me know if you have any questions.

Happy coding
