2048游戏中的Alpha Beta问题

6 投票
1 回答
1017 浏览
提问于 2025-04-18 10:01

我正在用Python为2048游戏写一个人工智能。进展比我预想的要慢很多。我把搜索的深度限制设置为5,但还是花了好几秒才得到答案。起初我以为是我实现的所有函数都不好,但后来我发现真正的问题所在。搜索树上的叶子节点比我想象的要多得多。

这是一个典型的结果(我统计了叶子节点、分支和扩展的数量):

111640 leaves, 543296 branches, 120936 expansions
Branching factor: 4.49242574585
Expected max leaves = 4.49242574585^5 = 1829.80385192 leaves

再来一个,作为补充:

99072 leaves, 488876 branches, 107292 expansions
Branching factor: 4.55650001864
Expected max leaves = 4.55650001864^5 = 1964.06963743 leaves

如你所见,搜索树上的叶子节点数量远远超过了如果我使用简单的最小最大算法时的数量。这是怎么回事呢?我的算法代码如下:

# Generate constants
import sys
posInfinity = sys.float_info.max
negInfinity = -sys.float_info.max

# Returns the direction of the best move given current state and depth limit
def bestMove(grid, depthLimit):
    global limit
    limit = depthLimit
    moveValues = {}
    # Match each move to its minimax value
    for move in Utils2048.validMoves(grid):
        gridCopy = [row[:] for row in grid]
        Utils2048.slide(gridCopy, move)
        moveValues[move] = minValue(grid, negInfinity, posInfinity, 1)
    # Return move that have maximum value
    return max(moveValues, key = moveValues.get)

# Returns the maximum utility when the player moves
def maxValue(grid, a, b, depth):
    successors = Utils2048.maxSuccessors(grid)
    if len(successors) == 0 or limit < depth:
        return Evaluator.evaluate(grid)
    value = negInfinity
    for successor in successors:
        value = max(value, minValue(successor, a, b, depth + 1))
        if value >= b:
            return value
        a = max(a, value)
    return value
# Returns the minimum utility when the computer moves
def minValue(grid, a, b, depth):
    successors = Utils2048.minSuccessors(grid)
    if len(successors) == 0 or limit < depth:
        return Evaluator.evaluate(grid)
    value = posInfinity
    for successor in successors:
        value = min(value, maxValue(successor, a, b, depth + 1))
        if value <= a:
            return value
        b = min(b, value)
    return value

有人能帮我一下吗?我已经仔细看过这段代码好几遍了,就是找不到问题出在哪里。

1 个回答

0

看起来你在把 valueb(beta)以及 a(alpha)进行比较。你代码中的比较方式如下:

def maxValue(grid, a, b, depth):
    .....
    .....
        if value >= b:
            return value
        a = max(a, value)
    return value

而且

def minValue(grid, a, b, depth):
    .....
    .....
        if value <= a:
            return value
        b = min(b, value)
    return value

不过,进行alpha-beta剪枝的条件是:只要alpha大于beta,也就是 alpha > beta,我们就不需要继续遍历搜索树了。

所以,应该是:

def maxValue(grid, a, b, depth):
    ....
    .....
        a = max(a, value)
        if a > b:
            return value

    return value

而且

def minValue(grid, a, b, depth):
    .....
    .....
        b = min(b, value)
        if b < a:
            return value

    return value

这是你遗漏的一个特殊情况,因为 a(alpha)和 b(beta)并不总是等于 value

撰写回答