Python 如何将字符串合并为最短的字符串?

2 投票
2 回答
1929 浏览
提问于 2025-04-16 09:59

如果我有一串字符串,我想把它们合并成一个字符串,并且要考虑到重叠的字符。如果没有重叠的字符串了,就把它直接加到最后。下面是一个非常简单的例子。

input = ['one', 'two']
output = 'twone'

我希望能找到一种方法,能够对输入列表中的任意数量的字符串进行这样的合并。

谢谢,
giodamelio

2 个回答

2

你可以尝试使用一种改进版的 Needleman-Wunsch 算法。不过从你的问题来看,我不太清楚你想要的组合字符串需要满足什么条件(比如一个字符串是否必须完整地嵌入到另一个字符串中,还是说两个字符串的字符可以交错出现?),所以我就不帮你修改了,但这应该能给你一个好的开始。

可以查看 http://en.wikipedia.org/wiki/Needleman-Wunsch_algorithm

还有,http://www.cs.berkeley.edu/~vazirani/algorithms/chap6.pdf

5

仅仅给一个简单的例子并不足够。这真的是今年(农历)的一个非常模糊的问题。不过假设重叠只发生在单词的两端,并且每个单词只测试两次(分别与当前输出的两端进行比较),如果你想要最大的重叠,这段代码可以完成这个任务:

[在修复错误后编辑]

def glue(a, b):
    maxn = 0
    for n in xrange(1, 1 + min(len(a), len(b))):
        suffix = a[-n:]
        prefix = b[:n]
        if prefix == suffix:
            maxn = n
    # BUG: return maxn, a[:-maxn] + b
    # FAILS when maxn == 0
    # EXTRA TEST: ['nil', 'overlap']
    return a + b[maxn:]     


def multiglue(words):
    if not words: return ""
    result = words[0]
    for word in words[1:]:
        nx, rx = glue(result, word)
        ny, ry = glue(word, result)
        result = rx if nx > ny else ry
    return result

tests = [line.split() for line in """
    one
    two one
    one two
    overlap nil
    nil overlap
    toad dog rabbit
    frog ogham
    ogham frog
    hopper grasshopper
    grass grasshopper person
    foooo oooof
    oooof foooo""".splitlines()]

for test in tests:
    out = multiglue(test)
    print test, repr(out)

输出:

[] ''
['one'] 'one'
['two', 'one'] 'twone'
['one', 'two'] 'twone'
['overlap', 'nil'] 'niloverlap'
['nil', 'overlap'] 'overlapnil'
['toad', 'dog', 'rabbit'] 'rabbitoadog'
['frog', 'ogham'] 'frogham'
['ogham', 'frog'] 'frogham'
['hopper', 'grasshopper'] 'grasshopper'
['grass', 'grasshopper', 'person'] 'grasshopperson'
['foooo', 'oooof'] 'foooof'
['oooof', 'foooo'] 'foooof'

撰写回答