Python 如何将字符串合并为最短的字符串?
如果我有一串字符串,我想把它们合并成一个字符串,并且要考虑到重叠的字符。如果没有重叠的字符串了,就把它直接加到最后。下面是一个非常简单的例子。
input = ['one', 'two']
output = 'twone'
我希望能找到一种方法,能够对输入列表中的任意数量的字符串进行这样的合并。
谢谢,
giodamelio
2 个回答
2
你可以尝试使用一种改进版的 Needleman-Wunsch 算法。不过从你的问题来看,我不太清楚你想要的组合字符串需要满足什么条件(比如一个字符串是否必须完整地嵌入到另一个字符串中,还是说两个字符串的字符可以交错出现?),所以我就不帮你修改了,但这应该能给你一个好的开始。
可以查看 http://en.wikipedia.org/wiki/Needleman-Wunsch_algorithm
还有,http://www.cs.berkeley.edu/~vazirani/algorithms/chap6.pdf
5
仅仅给一个简单的例子并不足够。这真的是今年(农历)的一个非常模糊的问题。不过假设重叠只发生在单词的两端,并且每个单词只测试两次(分别与当前输出的两端进行比较),如果你想要最大的重叠,这段代码可以完成这个任务:
[在修复错误后编辑]
def glue(a, b):
maxn = 0
for n in xrange(1, 1 + min(len(a), len(b))):
suffix = a[-n:]
prefix = b[:n]
if prefix == suffix:
maxn = n
# BUG: return maxn, a[:-maxn] + b
# FAILS when maxn == 0
# EXTRA TEST: ['nil', 'overlap']
return a + b[maxn:]
def multiglue(words):
if not words: return ""
result = words[0]
for word in words[1:]:
nx, rx = glue(result, word)
ny, ry = glue(word, result)
result = rx if nx > ny else ry
return result
tests = [line.split() for line in """
one
two one
one two
overlap nil
nil overlap
toad dog rabbit
frog ogham
ogham frog
hopper grasshopper
grass grasshopper person
foooo oooof
oooof foooo""".splitlines()]
for test in tests:
out = multiglue(test)
print test, repr(out)
输出:
[] ''
['one'] 'one'
['two', 'one'] 'twone'
['one', 'two'] 'twone'
['overlap', 'nil'] 'niloverlap'
['nil', 'overlap'] 'overlapnil'
['toad', 'dog', 'rabbit'] 'rabbitoadog'
['frog', 'ogham'] 'frogham'
['ogham', 'frog'] 'frogham'
['hopper', 'grasshopper'] 'grasshopper'
['grass', 'grasshopper', 'person'] 'grasshopperson'
['foooo', 'oooof'] 'foooof'
['oooof', 'foooo'] 'foooof'