检测数字序列中的重复循环（python）

9 投票

2 回答

9797 浏览

提问于 2025-04-17 09:13

我在想，这种做法有没有比较“常见”或者正常的方式。我并不是在寻找那种最简短的答案，比如只用两行代码之类的。我只是快速写了这段代码，但总觉得里面的内容太多了。如果有什么库可以帮助实现这个，那就太好了。

def get_cycle(line):
    nums = line.strip().split(' ')

    # 2 main loops, for x and y
    for x in range(2, len(nums)): # (starts at 2, assuming the sequence requires at least 2 members)
        for y in range(0, x):
            # if x is already in numbers before it
            if nums[x] == nums[y]:
                seq = [nums[x]] # (re)start the sequence
                adder = 1       # (re)set the adder to 1
                ok = True       # (re)set ok to be True
                # while the sequence still matches (is ok) and
                # tail of y hasn't reached start of x
                while ok and y + adder < x:
                    if nums[x + adder] == nums[y + adder]:  # if next y and x match
                        seq.append(nums[x + adder])         # add the number to sequence
                        adder += 1                          # increase adder
                    else:
                        ok = False                          # else the sequence is broken
                # if the sequence wasn't broken and has at least 2 members
                if ok and len(seq) > 1:
                    print(' '.join(seq))    # print it out, separated by an empty space
                    return

编程库算法优化数字序列重复检测

2 个回答

你的问题其实是“从x到x+k的所有项目是否和从y到y+k的项目匹配”。也就是说，是否有一个长度为k的子集在这行中出现了两次？

而且你希望x到x+k和y到y+k之间没有重叠。简单的方法是把y定义为x加上一个偏移量d。如果你确保k小于等于d，并且d小于line的长度减去x和k的和，那么你就总是在这行的范围内查找。

接下来，你会让k从1变到line长度的一半，寻找在彼此之间有一定偏移的不同长度的重复项。

从x到y的偏移量d会在1到line长度减去x和k的和之间变化。

同样，x的起始位置也会从0变化到line长度的一半。

所以，“所有”的部分可以这样理解：all( line[i] == line[i+d] for i in range(x,x+k) )，对于各种合法的d、x和k的值。

回答于 2025-04-17 由 Python大师

分享举报

我可能没有完全理解这个，但我觉得用正则表达式可以很简单地解决这个问题。

(.+ .+)( \1)+

这里有一个例子：

>>> regex = re.compile(r'(.+ .+)( \1)+')
>>> match = regex.search('3 0 5 5 1 5 1 6 8')
>>> match.group(0)    # entire match
'5 1 5 1'
>>> match.group(1)    # repeating portion
'5 1'
>>> match.start()     # start index of repeating portion
6

>>> match = regex.search('2 0 6 3 1 6 3 1 6 3 1')
>>> match.group(1)
'6 3 1'

它的工作原理是，(.+ .+) 会匹配至少两个数字（尽可能多），并把结果放到捕获组1里。( \1)+ 会匹配一个空格后面跟着捕获组1的内容，至少出现一次。

接下来，我们用字符串 '3 0 5 5 1 5 1 6 8' 来详细解释一下：

(.+ .+) 最开始会匹配整个字符串，但因为 ( \1)+ 会失败，所以它会放弃字符串末尾的一些字符，这个过程会一直回溯，直到 (.+ .+) 无法在字符串的开头匹配为止，这时正则引擎会向前移动，重新尝试。
这个过程会持续到捕获组从第二个5开始，它会放弃末尾的字符，直到捕获到 '5 1'，此时正则表达式会寻找任意数量的 ' 5 1' 来满足 ( \1)+，当然它会找到这个匹配，匹配就成功了。

回答于 2025-04-17 由 Python大师

分享举报

检测数字序列中的重复循环（python）

2 个回答

撰写回答