从两个字符串构建回文

2 投票

1 回答

141 浏览

提问于 2025-04-13 21:21

我想写一个Python函数，效率要高：

这个函数会接收两个字符串，'a'和'b'，然后尝试找出可以组合成的最长回文字符串。这个回文字符串是由'a'中的一个非空子串和'b'中的一个非空子串拼接而成。如果有多个符合条件的答案，它会返回字典序最小的那个。如果找不到这样的字符串，它会返回'-1'。

我现在有一个效率不高的解决方案，它会生成两个字符串的所有子串，然后创建所有可能的拼接，同时追踪最长的有效回文字符串：

def is_palindrome(word):
    """Check if a word is a palindrome."""
    reversed_word = word[::-1]
    return word == reversed_word


def all_substrings_of_word(word):
    """Generate all possible non-empty substrings of a given string."""
    substrings = []
    for sub_string_length in range(1, len(word) + 1):
        for i in range(len(word) - sub_string_length + 1):
            new_word = word[i:i + sub_string_length]
            substrings.append(new_word)
    return substrings


def buildPalindrome(a, b):
    """Attempt to find the longest palindromic string created by concatenating
    a substring of `a` with a substring of `b`."""
    sub_strings_a = all_substrings_of_word(a)
    sub_strings_b = all_substrings_of_word(b)

    # Generate all possible concatenations of substrings from `a` and `b`
    multiplexed_array = [
        word_a + word_b for word_a in sub_strings_a for word_b in sub_strings_b]

    # Find the best palindrome (longest, then lexicographically smallest)
    best_palindrome = ""
    for word in multiplexed_array:
        if is_palindrome(word):
            if len(word) > len(best_palindrome):
                best_palindrome = word
            elif len(word) == len(best_palindrome) and word < best_palindrome:
                best_palindrome = word

    return best_palindrome if best_palindrome else "-1"

print(buildPalindrome("bac", "bac"))   # EXPECTED OUTPUT -- aba
print(buildPalindrome("abc", "def"))   # EXPECTED OUTPUT -- -1
print(buildPalindrome("jdfh", "fds"))   # EXPECTED OUTPUT -- dfhfd

请问我该如何改进这个方法呢？

数据结构字符串处理算法优化字典序组合数学回文子串

1 个回答

你可以试试这个方法：

为字符串 b 中的所有子串建立一个字典树（trie）。如果用后缀树（suffix tree）会更好，因为它更高效。
考虑字符串 a 中所有可能的“中心”，这些中心是潜在的回文字符串。中心可以在两个相邻字符之间（当回文长度为偶数时），或者在一个字符上（当回文长度为奇数时）。对于每一个中心，做以下操作：
- 在这个中心找到最大的回文 p，只考虑字符串 a。
- 向左扩展 p，只要添加的字符（按照添加的顺序）在 b 的字典树中是一个单词。这是一个潜在的解决方案。将其与目前找到的最长回文进行比较，保留更长的那个。
- 如果无法以这种方式扩展 p，那么就缩短 p，直到移除的字符中有一个存在于 b 中。在这种情况下，我们有一个潜在的解决方案。
- 如果在最后的情况下，p 中没有任何字符出现在 b 中，那么在这个选择的中心就没有合适的回文。

然后反过来，把上面的过程应用到 a 和 b 的反转上，也就是说 a 变成 b 的反转，b 变成 a 的反转。这实际上意味着我们在原始的 b 中寻找回文中心。

这里有一个实现这个想法的代码：

# Of the two given strings choose the one that is longest, or if equal, comes first in lexical order
def longest(x, y):
    return min((-len(x), x), (-len(y), y))[1]

def buildTrie(s):
    trie = {}
    for i in range(len(s)):
        node = trie
        for j in range(i, len(s)):
            node = node.setdefault(s[j], {})
    return trie

def buildPalindromeTrincot(s1, s2):
    palin = ""
    # Two passes: one for where the center of a palindrome is in s1, one for the reverse case
    for a, b in ((s1, s2), (s2[::-1], s1[::-1])):
        # Build a trie for B substrings (could be suffixtree for better performance)
        trie = buildTrie(b)
        n = len(a)
        # Visit all possible centers in A for a potential solution of at least 2 characters 
        for center in range(2*n-1, 0, -1):
            # Get the offsets of the innermost characters that must be equal 
            #   for a palindrome of at least two characters
            mid1 = (center - 1)//2
            mid2 = (center + 2)//2
            # Get largest palindrome at this center in A alone: 
            #   `left` will point to the left-neighboring character to that palindrome
            left = next((left for left, right in zip(range(mid1, 0, -1), range(mid2, n)) 
                         if a[left] != a[right]), 
                        max(0, mid1 + mid2 - n))
            # Must extend the palindrome with a substring from B
            node = trie.get(a[left], None)
            if node is not None:  # We can extend the palindrome using B
                for left in range(left-1, -1, -1):
                    if a[left] not in node:
                        left += 1
                        break
                    node = node[a[left]]                    
            else: 
                # See if we can drop characters from the palindrome in A 
                #    until we can replace one with the same character from B
                left = next((left for left in range(left+1, mid1+1) if a[left] in trie), None)
                if left is None:
                    continue  # No solution found here
            palin = longest(a[left:mid2] + a[left:mid1+1][::-1], palin)
                
    return palin or "-1"

对于大约40个字符的输入，这个实现比你提供的原始代码快100倍。对于70个字符的输入，这个速度提升达到1000倍。对于500个字符的输入，这个实现能在不到一秒的时间内给出答案。

还有一些改进的空间，比如：

使用后缀树代替字典树
在未来的回文不可能比已经找到的更长时提前退出
让中心从 a+b 的中心向外“扇形”扩展，这样可以更快找到更大的回文。为此，你需要先建立两个字典树，因为你会在两者之间切换。

不过，上面的代码已经带来了显著的改进，所以我没有继续追求这些改进。

回答于 2025-04-13 由 Python大师

分享举报

从两个字符串构建回文

1 个回答

撰写回答