pythonminimax

Minimax function for board game


def minimax(
        board: Board, 
        depth: int, 
        max_depth: int, 
        is_black: bool
    ) -> tuple[Score, Move]:
    """
    Finds the best move for the input board state.
    Note that you are black.

    Parameters
    ----------
    board: 2D list of lists. Contains characters "B", "W", and "_",
    representing black pawn, white pawn, and empty cell, respectively.

    depth: int, the depth to search for the best move. When this is equal
    to `max_depth`, you should get the evaluation of the position using
    the provided heuristic function.

    max_depth: int, the maximum depth for cutoff.

    is_black: bool. True when finding the best move for black, False
    otherwise.

    Returns
    -------
    A tuple (evalutation, ((src_row, src_col), (dst_row, dst_col))):
    evaluation: the best score that black can achieve after this move.
    src_row, src_col: position of the pawn to move.
    dst_row, dst_col: position to move the pawn to.
    """
    def max_value(board, depth):
        if utils.is_game_over(board) or depth == max_depth: #if game is over or depth is max_depth return current score
            return evaluate(board), None  # Return v along with action
        v = float('-inf')
        best_action = None  # Initialize best action
        for action in generate_valid_moves(board): #for each action in valid moves
            next_board = utils.state_change(board, action[0], action[1], False) #generate next state
            next_board = utils.invert_board(next_board, False) #invert the board for the white turns(because all the function is applied for black), so invert board will flip the board and turn white to black
            score, _ = max_value(next_board, depth + 1) #get the score for the next state
            if score > v:
                v = score
                best_action = action  # Update best action
        return v, best_action

    return max_value(board, depth)

I have try to implement a minmax algo to get the optimal movement for black or for white with the max depth. As i run it with my testcase, it just return the first action instead of the optimal action which result in the highest evaluation among all. I cant figure out the problem in my code..

def minimax(
        board: Board, 
        depth: int, 
        max_depth: int, 
        is_black: bool
    ) -> tuple[Score, Move]:
    """
    Finds the best move for the input board state.
    Note that you are black.

    Parameters
    ----------
    board: 2D list of lists. Contains characters "B", "W", and "_",
    representing black pawn, white pawn, and empty cell, respectively.

    depth: int, the depth to search for the best move. When this is equal
    to `max_depth`, you should get the evaluation of the position using
    the provided heuristic function.

    max_depth: int, the maximum depth for cutoff.

    is_black: bool. True when finding the best move for black, False
    otherwise.

    Returns
    -------
    A tuple (evalutation, ((src_row, src_col), (dst_row, dst_col))):
    evaluation: the best score that black can achieve after this move.
    src_row, src_col: position of the pawn to move.
    dst_row, dst_col: position to move the pawn to.
    """
    if depth == max_depth or utils.is_game_over(board):
        return evaluate(board), None

    # Determine the best move and its evaluation
    if is_black:
        best_evaluation = float('-inf')
        best_move = None
        for action in generate_valid_moves(board):
            new_board = utils.state_change(board, action[0], action[1], in_place=False)
            opponent_evaluation, _ = minimax(new_board, depth + 1, max_depth, False)
            if opponent_evaluation > best_evaluation:
                best_evaluation = opponent_evaluation
                best_move = (action[0], action[1])
        return best_evaluation, best_move
    else:
        best_evaluation = float('inf')
        best_move = None
        for action in generate_valid_moves(utils.invert_board(board, in_place=False)):
            new_board = utils.state_change(utils.invert_board(board, in_place=False), action[0], action[1], in_place=False)
            opponent_evaluation, _ = minimax(utils.invert_board(new_board, in_place=False), depth + 1, max_depth, True)
            if opponent_evaluation < best_evaluation:
                best_evaluation = opponent_evaluation
                best_move = (action[0], action[1])  # Convert from black's perspective to white's
        return best_evaluation, best_move

I just try a different approach where switch cases when each turn of black and white, by using the internal implementation invert table. So it basically pass the public test case for the solution, but it encounter issues that state that "Make sure that 'evaluate' is called with the correct input, especially for white" which i dont understand why? I thought my evaluate correctly evaluate for white and black as I have inverted the board after white turn.


Solution

  • Assuming the functions you call are all correct (like invert_board, ...), there is still this one problem:

    Although the board is inverted, you've not inverted the score. What is good for white is bad for black, and vice versa, so you should invert the score.

    I'd change this:

    score, _ = max_value(next_board, depth + 1)
    

    to this:

    score = -max_value(next_board, depth + 1)[0]
    

    Some unrelated comments: