I am currently attempting to make a minimax algorithm for connect 4 (6x7 board) in Python. I already made a similarly structured function for tic-tac-toe that works fine but whenever I run the connect four minimax method my computer stalls for a minute or two before I kill the function. I know alpha-beta pruning is supposed to make minimax work better but I want to master minimax before implementing pruning. Is it a problem with my code, or do I have to implement alpha-beta pruning for it to work?
Note: I've already tested the boardWin and MakeMove methods so I'm pretty sure its not that
import math
class reproduce_mm:
def __init__(self):
self.evals = [" ", " ", " ", " ", " ", " ", " "] #stores minimax values for each move calculated by minimax
def minimax_c4(self, board, computer, move): #minimax algo
#inputs: board = 6x7 2d array with x's and o's, computer = x or o, move = whose turn it is (x or o)
player = "o" if computer == "x" else "x"
next = "o" if move == "x" else "x"
count = 1 #count is number of free spaces left in c4 board + 1
for x in range(len(board)):
for y in range(len(board[x])):
if (board[x][y] == None):
count = count + 1
for column in range(len(board[0])):
if (self.emptyCol(board, column) == False):
self.evals[column] = 0
winner = self.boardWin(board)
if (winner == computer): #return -count if winner, -count if loser
return count
elif (winner == player):
return count * -1
elif (winner == "draw"):
return 0
else:
for eval in range(len(self.evals)):
if (self.evals[eval] == 0):
temp = board.copy()
temp = self.makeMove(temp, move, eval)
self.evals[eval] = self.minimax_c4(temp, computer, next)
self.clearBoard(eval, board)
#clears board of previous move in temp (Tried copy.deepcopy but not sure if it works)
if (move == computer):
max_val = -math.inf
for eval in range(len(self.evals)):
if (self.evals[eval] != " "):
if (self.evals[eval] > max_val):
max_val = self.evals[eval]
return max_val
if (move == player):
min = math.inf
for eval in range(len(self.evals)):
if (self.evals[eval] != " "):
if (self.evals[eval] < min):
min = self.evals[eval]
return min
def clearBoard(self, c, board):
for r in range(len(board)):
if (board[r][c] != None):
board[r][c] = None
return
return
def makeMove(self, board, move, c):
first = True
while (first == True):
for r in reversed(range(len(board))):
if (first == True):
if (board[r][c] == None):
board[r][c] = move
first = False
return board
def emptyCol(self, board, c):
full = True
for r in reversed(range(len(board))):
if (board[r][c] == None):
full = False
return full
def chooseMove(self): #chooses move by looking at max value from self.evals
max = None
index = None
for ev in range(len(self.evals)):
if (self.evals[ev] != " "):
if (max == None):
max = self.evals[ev]
index = ev
elif (max != None):
if (ev > max):
max = self.evals[ev]
index = ev
return index
def boardWin(self, board):
full_board = True
winner = None
switched = [[None]*6 for _ in range(7)]
for r in range(len(switched)): # flips board sideways to check if columns are 3 in a row
for c in range(len(switched[r])):
switched[r][c] = board[c][r]
for row in range(len(board)):
for item in board[row]:
if (item == None):
full_board = False
for x1 in range(len(board)):
if (self.across_win(board[x1]) != None):
winner = self.across_win(board[x1])
return winner
for y1 in range(len(board[x1])):
if (self.diag_win(board, x1, y1) != None):
winner = self.diag_win(board, x1, y1)
return winner
for x2 in range(len(switched)):
if (self.across_win(switched[x2]) != None):
winner = self.across_win(switched[x2])
return winner
if (full_board == True and winner == None):
return "draw"
return None
def across_win(self, row, counter = 0, winner = ""):
if (counter == 4):
return winner
elif (len(row) == 0):
return None
elif (counter == 0):
if (row[0] != None):
winner = row[0]
counter = counter + 1
return self.across_win(row[1 : len(row)], counter, winner)
else:
if (row[0] == winner):
counter = counter + 1
return self.across_win(row[1 : len(row)], counter, winner)
else:
if (row[0] != None):
return self.across_win(row[1 : len(row)], counter = 1, winner = row[0])
else:
return self.across_win(row[1 : len(row)], counter = 0, winner = "")
def diag_win(self, board, r, c):
if (r < 3 and c < 4):
if (board[r][c] != None):
if (board[r][c] == board[r + 1][c + 1] == board[r + 2][c + 2] == board[r + 3][c + 3]):
return str(board[r][c])
if (r < 3 and c > 2):
if (board[r][c] != None):
if (board[r][c] == board[r + 1][c - 1] == board[r + 2][c - 2] == board[r + 3][c - 3]):
return str(board[r][c])
return None
board = [[None, None, None, None, None, None, None],
[None, None, None, None, None, None, None],
[None, None, None, None, None, None, None],
["x", "y", None, None, None, None, None],
["x", "x", None, None, None, None, None],
["x", "y", "y", "y", None, None, None]]
ai = reproduce_mm()
print(ai.minimax_c4(board, "x", "x"))
ai.makeMove(board, "x", 0)
print(str(ai.chooseMove()))
print(board)
Is it a problem with my code, or do I have to implement alpha-beta pruning for it to work?
It is a problem with the number of game states that you want to evaluate. Your minimax function is designed to keep playing moves, recursing until the end of the game is reached and so all possible (4,531,985,219,092) games are played: even worse, as the same game states are visited multiple times by exchange of moves (1. e1 d1 2. c1
leads to the same state as 1. c1 d1 2. e1
), your code actually intends to visit an astronomically larger number of end states.
Here are some measures you can take to improve:
Set a maximum depth to your recursion. When it is reached evaluate the board without playing more moves: this would be a heuristic that "guesses" what a good estimated value would be (think of counting the number of 3-in-a-rows that can still become 4, recognise winning patterns, ...).
Use memoization to see if you have evaluated a game state before, so to avoid doing the same job again. Think of an efficient way to identify a game state uniquely (think in terms of bits and column heights). Also realise that the result of the game is the same when you mirror it (left - right), so if you evaluated a game state, also its mirror is evaluated.
Use an opening database. You can find these online: they list positions with a fixed number of discs (for example: all possible positions after 8 plies) with the theoretical outcome (win, loss, draw). Then make that part of your memoization table.
The heuristic function could use the concept of zugzwang to predict the outcome (like "In this column I will always play on top of the opponent" or "if they play in column A, I'll play in column B, or vice versa", ...): see Victor Allis' thesis: "A Knowledge-based Approach of Connect-Four".
Alpha-beta pruning: you are right that this can make the search tree narrower, but it isn't the most important optimisation at this point, and wont help much when you don't take any of the other measures.
Iterative deepening: once you have implemented alpha-beta pruning, its effect will be greater if you visit moves in the order of how good they are. To know how good they are you could first perform the search with a lesser depth, then sort the moves according to that evaluation, and only then do the search with the deeper depth. You can do this iteratively: first search with depth 2, then 4, then 6... You'll have to reset memoization wisely.
There are obviously more techniques than this.