c++, artificial-intelligence, alpha-beta-pruning, negamax

C++ Negamax alpha-beta wrong cutoff?


I've been using negamax to play Connect Four. The thing I noticed is that if I add alpha-beta, it sometimes gives "wrong" results, as in making a losing move I don't believe it should make at the depth I am searching. If I remove the alpha-beta it plays how it is supposed to. Can alpha-beta cut off some actually viable branches (especially when the depth is limited)? Here's the code just in case:

int negamax(const GameState& state, int depth, int alpha, int beta, int color)
{
    //depth end reached? or we actually hit a win/lose condition?
    if (depth == 0 || state.points != 0)
    {
        return color*state.points;
    }

    //get successors and optimize the ordering/trim maybe too
    std::vector<GameState> childStates;
    state.generate_successors(childStates);
    state.order_successors(childStates);

    //no possible moves - then it's a terminal state
    if (childStates.empty())
    {
        return color*state.points;
    }
    int bestValue = -extremePoints;
    int v;
    for (GameState& child : childStates)
    {
        v = -negamax(child, depth - 1, -beta, -alpha, -color);
        bestValue = std::max(bestValue, v);
        alpha = std::max(alpha, v);
        if (alpha >= beta)
            break;
    }
    return bestValue;
}

Solution

  • Can the alpha-beta cut off some actually viable branches (especially when the depth is limited)?

    The Alpha-Beta algorithm returns the same result as Minimax (the evaluation at the root node and the line of play), but (often) in less time, because it prunes away branches that cannot possibly influence the final decision (you can read a proof in Analysis of the alpha-beta pruning algorithm by Samuel H. Fuller, 1973).

    You're using the Negamax formulation of Alpha-Beta pruning, but that is just a variant that simplifies the implementation of the algorithm; it doesn't change the values returned.

    The fail-soft refinement (returning bestValue, which may fall outside the [alpha, beta] window, instead of a value clamped to it) doesn't change the situation either.
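
    For comparison, a fail-hard version of the same loop would clamp the returned value to the window. This is only an illustrative sketch, not code from the question:

        //fail-hard variant of the search loop (illustrative sketch)
        for (GameState& child : childStates)
        {
            int v = -negamax(child, depth - 1, -beta, -alpha, -color);
            if (v >= beta)
                return beta;   //never return more than beta
            if (v > alpha)
                alpha = v;     //raise the lower bound
        }
        return alpha;          //never return less than the original alpha

    With a full initial window both variants give the same root score as Minimax; fail-soft just returns more informative bounds.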

    Of course a search at shallow depth can pick bad moves, but the same holds for Minimax.

    So it has to be an implementation error.
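
    One practical way to find it: since pruning must not change the root score, compare the pruned and unpruned searches on the same positions. A minimal sketch, assuming a hypothetical negamax_plain helper (the same recursion as negamax, but without the alpha/beta window):

        #include <iostream>

        //sanity check: pruning must not change the root score.
        //negamax_plain is a hypothetical helper: negamax without the window.
        void check_pruning(const GameState& state, int maxDepth, int color)
        {
            for (int d = 1; d <= maxDepth; ++d)
            {
                int withPruning = negamax(state, d, -extremePoints, +extremePoints, color);
                int noPruning   = negamax_plain(state, d, color);
                if (withPruning != noPruning)
                    std::cerr << "pruning mismatch at depth " << d << ": "
                              << withPruning << " vs " << noPruning << '\n';
            }
        }

    The smallest depth at which the two disagree is usually the easiest case to debug by hand.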

    The code shown seems right to me. You should check:

    1. the way you call negamax at the root node. It should be something like:

       negamax(rootState, depth, -extremePoints, +extremePoints, color)

       i.e. alpha / beta start out as the lowest and highest values possible.

       If you are using different initial values for alpha / beta (e.g. aspiration windows) and the true score falls outside that initial window, you need to re-search with a wider window (a sketch of a full root call follows after this list).

    2. how you collect / store / manage / propagate the moves of the principal variation (the relevant code is missing). Techniques like PV-tables are tied to the updates of bestValue. If this is the problem, you should get the same score for the position as Minimax, but a different best move.
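
    For reference, a minimal root driver covering both points might look like this. It is only a sketch: GameState, generate_successors, order_successors and extremePoints are assumed to behave as in the question, and the best move is chosen at the root rather than inside negamax:

        //root call: search every move with the full window and pick the best one here
        GameState best_move(const GameState& root, int depth, int color)
        {
            std::vector<GameState> children;
            root.generate_successors(children);
            root.order_successors(children);

            GameState best = children.front();   //assumes at least one legal move exists
            int alpha = -extremePoints;           //lowest possible value
            for (GameState& child : children)
            {
                //beta stays at +extremePoints at the root, so -beta is -extremePoints
                int v = -negamax(child, depth - 1, -extremePoints, -alpha, -color);
                if (v > alpha)                    //strictly better: new principal move
                {
                    alpha = v;
                    best = child;
                }
            }
            return best;
        }

    If this kind of root loop still picks a losing move while the unpruned search does not, the bug is almost certainly in how the root move is recorded rather than in the pruning itself.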