如何提高这段代码的性能？

43 投票

7 回答

22113 浏览

提问于 2025-04-16 07:42

感谢这里的一些朋友的帮助，我终于让我的塔斯马尼亚骆驼谜题的代码跑起来了。不过，它的速度慢得让人受不了（我觉得是这样。因为这是我写的第一个Python程序，所以不太确定）。代码底部的示例运行在我电脑上花了很长时间才解决：

dumrat@dumrat:~/programming/python$ time python camels.py
[['F', 'F', 'F', 'G', 'B', 'B', 'B'], ['F', 'F', 'G', 'F', 'B', 'B', 'B'],
 ['F', 'F', 'B', 'F', 'G', 'B', 'B'], ['F', 'F', 'B', 'F', 'B', 'G', 'B'],
 ['F', 'F', 'B', 'G', 'B', 'F', 'B'], ['F', 'G', 'B', 'F', 'B', 'F', 'B'],
 ['G', 'F', 'B', 'F', 'B', 'F', 'B'], ['B', 'F', 'G', 'F', 'B', 'F', 'B'],
 ['B', 'F', 'B', 'F', 'G', 'F', 'B'], ['B', 'F', 'B', 'F', 'B', 'F', 'G'],
 ['B', 'F', 'B', 'F', 'B', 'G', 'F'], ['B', 'F', 'B', 'G', 'B', 'F', 'F'],
 ['B', 'G', 'B', 'F', 'B', 'F', 'F'], ['B', 'B', 'G', 'F', 'B', 'F', 'F'],
 ['B', 'B', 'B', 'F', 'G', 'F', 'F']]

real    0m20.883s
user    0m20.549s
sys    0m0.020s

这是我的代码：

import Queue

fCamel = 'F'
bCamel = 'B'
gap = 'G'

def solution(formation):
    return len([i for i in formation[formation.index(fCamel) + 1:]
                if i == bCamel]) == 0

def heuristic(formation):
    fCamels, score = 0, 0
    for i in formation:
        if i == fCamel:
            fCamels += 1;
        elif i == bCamel:
            score += fCamels;
        else:
            pass
    return score

def getneighbors (formation):
    igap = formation.index(gap)
    res = []
    # AB_CD --> A_BCD | ABC_D | B_ACD | ABD_C
    def genn(i,j):
        temp = list(formation)
        temp[i], temp[j] = temp[j], temp[i]
        res.append(temp)

    if(igap > 0):
        genn(igap, igap-1)
    if(igap > 1):
        genn(igap, igap-2)
    if igap < len(formation) - 1:
        genn(igap, igap+1)
    if igap < len(formation) - 2:
        genn(igap, igap+2)

    return res

class node:
    def __init__(self, a, g, p):
        self.arrangement = a
        self.g = g
        self.parent = p

def astar (formation, heuristicf, solutionf, genneighbors):

    openlist = Queue.PriorityQueue()
    openlist.put((heuristicf(formation), node(formation, 0, None)))
    closedlist = []

    while 1:
        try:
            f, current = openlist.get()
        except IndexError:
            current = None

        if current is None:
            print "No solution found"
            return None;

        if solutionf(current.arrangement):
            path = []
            cp = current
            while cp != None:
                path.append(cp.arrangement)
                cp = cp.parent
            path.reverse()
            return path

        #arr = current.arrangement
        closedlist.append(current)
        neighbors = genneighbors(current.arrangement)

        for neighbor in neighbors:
            if neighbor in closedlist:
                pass
            else:
                openlist.put((current.g + heuristicf(neighbor),
                             node(neighbor, current.g + 1, current)))

        #sorted(openlist, cmp = lambda x, y : x.f > y.f)

def solve(formation):
    return astar(formation, heuristic, solution, getneighbors)

print solve([fCamel, fCamel, fCamel, gap, bCamel, bCamel, bCamel])
#print solve([fCamel, fCamel, fCamel, fCamel, gap, bCamel, bCamel, bCamel, bCamel])

这只是针对每个骆驼3个的情况。我想至少做4个骆驼的测试。那个测试案例还在运行中（现在已经过去大约5分钟了 :()。如果它完成了，我会更新这个内容。

我该怎么做才能改善这段代码呢？（主要是提高性能，但其他建议也欢迎）。

代码优化编程技巧性能提升计算效率算法复杂度测试案例运行时间分析

7 个回答

tkerwin说得对，你应该使用集合来作为closedlist，这样会快很多，但对于每边有4只骆驼的情况，速度还是有点慢。下一个问题是，你允许了一些不可能的解，因为你让fCamels可以往回走，而bCamels可以往前走。要解决这个问题，把以下代码：

if(igap > 0):
    genn(igap, igap-1)
if(igap > 1):
    genn(igap, igap-2)
if igap < len(formation) - 1:
    genn(igap, igap+1)
if igap < len(formation) - 2:
    genn(igap, igap+2)

替换成：

if(igap > 0 and formation[igap-1] == fCamel):
    genn(igap, igap-1)
if(igap > 1 and formation[igap-2] == fCamel):
    genn(igap, igap-2)
if (igap < len(formation) - 1) and formation[igap+1] == bCamel:
    genn(igap, igap+1)
if (igap < len(formation) - 2) and formation[igap + 2] == bCamel:
    genn(igap, igap+2)

这样，我在大约0.05秒内就能找到每边4只骆驼的问题的解，而不是10秒。我还试了每边5只骆驼，结果是0.09秒。我也在使用集合作为closedlist，并且用heapq而不是Queue。

额外的加速

你可以通过正确使用启发式方法来进一步加速。目前，你使用的是这行代码：

openlist.put((current.g + heuristicf(neighbor), node(neighbor, current.g + 1, current)))

（或者heapq版本的那一行），但你应该把它改成：

openlist.put((heuristicf(neighbor), node(neighbor, current.g + 1, current)))

这个改动没有考虑到已经需要的移动次数，但没关系。对于这个谜题（以及排除那些让骆驼走错方向的移动），你不需要担心需要多少步——要么一步能让你更接近解，要么就会走到死胡同。换句话说，所有可能的解都需要相同的步数。这个小改动把每边12只骆驼的情况找到解的时间从超过13秒（即使使用了heapq、集合作为closedlist，以及上面提到的找邻居的改动）缩短到了0.389秒。这还不错。

顺便说一下，检查你是否找到了解的一个更好方法是，看第一个fCamel的索引是否等于formation的长度/2 + 1（使用整数除法），并且它前面的索引是否等于间隙。

回答于 2025-04-16 由 Python大师

分享举报

首先让我告诉你怎么找到问题。然后我会告诉你问题出在哪里：

我甚至没有费心去理解你的代码。我只是运行了一下，然后随机取了3个时间点的堆栈样本。我是通过按下控制键+C来实现的，然后查看生成的堆栈跟踪信息。

一种看待这个问题的方法是：如果某个语句出现在X%的随机堆栈跟踪中，那么它在堆栈中大约占据了X%的时间，这就是它的责任所在。如果你能避免执行它，那你就能节省这么多时间。

好的，我取了3个堆栈样本。它们是：

File "camels.py", line 87, in <module>
  print solve([fCamel, fCamel, fCamel, gap, bCamel, bCamel, bCamel])
File "camels.py", line 85, in solve
  return astar(formation, heuristic, solution, getneighbors)
File "camels.py", line 80, in astar
  openlist.put((current.g + heuristicf(neighbor), node(neighbor, current.g + 1, current)))

File "camels.py", line 87, in <module>
  print solve([fCamel, fCamel, fCamel, gap, bCamel, bCamel, bCamel])
File "camels.py", line 85, in solve
  return astar(formation, heuristic, solution, getneighbors)
File "camels.py", line 80, in astar
  openlist.put((current.g + heuristicf(neighbor), node(neighbor, current.g + 1, current)))

File "camels.py", line 87, in <module>
  print solve([fCamel, fCamel, fCamel, gap, bCamel, bCamel, bCamel])
File "camels.py", line 85, in solve
  return astar(formation, heuristic, solution, getneighbors)
File "camels.py", line 80, in astar
  openlist.put((current.g + heuristicf(neighbor), node(neighbor, current.g + 1, current)))

注意，在这个例子中，所有的堆栈样本都是一样的。换句话说，这三行代码几乎都占用了所有的时间。所以看看它们：

line        87: print solve([fCamel, fCamel, fCamel, gap, bCamel, bCamel, bCamel])
line solve: 85: return astar(formation, heuristic, solution, getneighbors)
line astar: 80: openlist.put((current.g + heuristicf(neighbor), node(neighbor, current.g + 1, current)))

很明显，第87行是你无法避免执行的，可能第85行也一样。那就只剩下第80行，也就是openlist.put这个调用。现在，你无法判断问题出在+运算符、heuristicf调用、node调用，还是put调用上。如果你能把这些分开到不同的行上，你就能找出问题所在。

所以我希望你能从中学到一个快速简单的方法，来找出你的性能问题在哪里。

回答于 2025-04-16 由 Python大师

分享举报

我之前也遇到过这个问题。这里的瓶颈其实是 if neighbor in closedlist 这一句。

in 这个语句用起来很简单，但你可能会忘记它是线性搜索。当你在列表上进行线性搜索时，速度会变得很慢，尤其是列表很大的时候。这时你可以把 closedlist 转换成一个 set 对象。这样可以存储它里面的元素的哈希值，所以用 in 操作符查找时会比在列表中快得多。不过，列表是不能哈希的，所以你需要把你的配置改成元组，而不是列表。

如果 closedlist 的顺序对算法很重要，你可以用集合来做 in 操作，同时保留一个并行的列表来存储你的结果。

我尝试了一个简单的实现，包括 aaronasterling 的 namedtuple 技巧，结果在你的第一个例子中运行了 0.2 秒，在第二个例子中运行了 2.1 秒，不过我还没有验证第二个更长的例子的结果。

回答于 2025-04-16 由 Python大师

分享举报

如何提高这段代码的性能？

7 个回答

撰写回答