Having trouble with the logic of my minimax function. All other helper functions are tested and performing properly. When the AI of my tic-tac-toe game reaches the minimax function, the error I get is "TypeError: '\<' not supported between instances of 'tuple' and 'float'"
, so I'm guessing it's an issue with comparing v and the new_action tuple that is returned? So far unable to diagnose the issue in the logic of minimax. Min player O has a utility of -1, 0 for a tie, and max player X has a utility of 1. ("state"
is the 2D array of the game board)
def minimax(state):
def max_value(state):
if terminal(state): return utility(state)
v = float('-inf')
new_action = None
for action in actions(state):
temp = max(v, min_value(result(state, action)))
if temp > v:
v = temp
new_action = action
return new_action
def min_value(state):
if terminal(state): return utility(state)
v = float('inf')
new_action = None
for action in actions(state):
temp = min(v, max_value(result(state, action)))
if temp < v:
v = temp
new_action = action
return new_action
if terminal(state): return None
if player(state) == X: max_value(state)
else: min_value(state)
I've looked into the logic of minimax functions and pseudocode but haven't been able to make progress.
There are a few issues:
The final if..else
statement in minimax
doesn't return anything, so minimax
always returns None
There is a mix up of return value for min_value
and max_value
. On the one hand, the expression max(v, min_value(result(state, action)))
expects min_value
to return a value (score), while on the other hand, return new_action
returns the action (move) -- which the ultimate caller of minimax
would also like to get.
The solution is to have min_value
and max_value
return a pair so the caller has both the value (score) and the action (move). For that reason I would also rename these functions to minimize
and maximize
(as they don't only return the value)
Note that there is no need to call max
or min
where you have it. Also, there is no need to check for a terminal state in the main body of the minimax
function, as this is the first check that the inner functions make anyhow.
def minimax(state):
def maximize(state):
if terminal(state):
return (utility(state), None)
best = (float('-inf'), None) # Pair of value and action
for action in actions(state):
temp = minimize(result(state, action)) # We get a pair
if temp > best:
best = (temp, action)
return best # Return the pair of value and action
def minimize(state):
if terminal(state):
return (utility(state), None)
best = (float('inf'), None) # Pair of value and action
for action in actions(state):
temp = maximize(result(state, action)) # We get a pair
if temp < best: # Compare pairs by value first
best = (temp, action)
return best # Return the pair of value and action
# Choose which function to call
optimize = (minimize, maximize)[player(state) == X]
return optimize(state)[1] # Extract the move from the pair and return it
If action
objects are not comparable, then compare the first pairmembers only, like if temp[0] < best[0]
.
The statement to set optimize
is short for:
if player(state) == X:
optimize = maximize
else:
optimize = minimize
As functions are objects like other types are, you can just assign them and then call them (as optimize
references a function).
Without that extra name optimize
you'd do:
if player(state) == X:
return maximize(state)[1]
else:
return minimize(state)[1]