How would I convert the output of a Keras model into a chess move?

My model was trained on thousands of FENs (chess positions), each paired with the move that was played in response to that position. The model should output a predicted move in response to a FEN.

This is the code that I used to train the model:

def train(X, y):
    # Split the data into training and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.001, random_state=42)
    
    # Preprocess the data (you may need to implement FEN2ARRAY function for neural network input)
    X_train_processed = FEN2ARRAY(X_train)
    X_test_processed = FEN2ARRAY(X_test)
    
    # Encode the target variable (moves) for categorical classification
    label_encoder = LabelEncoder()
    label_encoder.fit(y_train)
    y_train_encoded = label_encoder.transform(y_train)
    y_test_encoded = label_encoder.transform(y_test)
    num_classes = len(label_encoder.classes_)
    
    # Convert target variable to one-hot encoded format
    y_train_categorical = to_categorical(y_train_encoded, num_classes=num_classes)
    y_test_categorical = to_categorical(y_test_encoded, num_classes=num_classes)
    
    # Define the neural network architecture
    model = Sequential()
    model.add(Dense(128, activation='relu', input_shape=(64,)))  # Adjust input_shape based on your input data
    model.add(Dense(64, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))  # Use softmax activation for multi-class classification
    
    # Compile the neural network model
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    
    # Train the neural network model
    model.fit(X_train_processed, y_train_categorical, epochs=10, batch_size=32, validation_data=(X_test_processed, y_test_categorical))
    
    return model

But when I output the prediction here:

prediction = model.predict(FEN2ARRAY("8/8/6p1/5k1p/4Rp1P/5P2/p5PK/r7 w - - 8 48"))
print(prediction)

I get this output:

1/1 [==============================] - 0s 83ms/step
[[9.8108569e-05 2.6935700e-04 9.9022780e-04 ... 2.1389520e-03
  1.9414679e-04 1.4036804e-03]]

How would I convert that output to a chess move?

Solution

It's important to understand what you're doing, or at least what ChatGPT is trying to do.

Let's break down the code starting with this part:

# Encode the target variable (moves) for categorical classification
label_encoder = LabelEncoder()
label_encoder.fit(y_train)

What you're doing here is encoding your training target values (which are the moves you're teaching the network to predict). This means you're creating an "index" where each different move that appears in your training data is assigned an integer value. For example:

index = {0: "e4", 1: "Nc3", 2: "d4", ...}

Based on this, you are teaching the network to predict which of the moves seen in training is best for the current game state. The output of the model is a probability vector with one entry per move in the index above, so position i in the vector is the probability the model assigns to move i.
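As a small illustration of that index, here is a sketch with a handful of made-up moves and a made-up probability vector (neither comes from your data). One detail worth knowing: LabelEncoder assigns the integers in sorted order of the labels, not in order of appearance.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Made-up training targets, for illustration only
moves = ["e4", "Nc3", "d4", "e4", "Nf3"]

label_encoder = LabelEncoder()
label_encoder.fit(moves)

# classes_ holds the unique labels in sorted order; the position is the class index
index = {i: str(move) for i, move in enumerate(label_encoder.classes_)}
print(index)  # {0: 'Nc3', 1: 'Nf3', 2: 'd4', 3: 'e4'}

# A made-up softmax output with one probability per class, shape (1, num_classes)
fake_prediction = np.array([[0.1, 0.2, 0.6, 0.1]])

# Highest-probability class index, then back to a move label
best = int(np.argmax(fake_prediction[0]))
decoded_move = label_encoder.inverse_transform([best])[0]
print(decoded_move)  # d4
```

The vector you printed in the question has the same structure, just with one entry for every distinct move in your training set.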

Some additional points to consider:

If you continue with this approach, make sure to save this label_encoder along with your model. In your train() function it is a local variable and only the model is returned, so return it as well or persist it to disk. You will need it to interpret the predictions later and to tell which move each index corresponds to.
To get the predicted move, you will need to find the index with the highest probability and then use the label_encoder to convert it back to a chess move.
Note that this approach can only predict moves it has seen in the training data. It will not be able to generate new moves that are not in the training set. It also has no notion of legality, so the highest-probability move may be illegal in the given position.
The quality of your predictions will depend greatly on the quality and quantity of your training data, as well as the architecture of your network and how you train it.
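For the first point, one possible way to keep the encoder around is to pickle it next to the saved model. This is only a sketch: the file name is hypothetical, the encoder below is a stand-in fitted on made-up moves, and the model itself would be saved separately (for example with Keras' model.save).

```python
import os
import pickle
import tempfile

from sklearn.preprocessing import LabelEncoder

# Stand-in for the encoder fitted inside train(); the moves are made up
label_encoder = LabelEncoder().fit(["e4", "d4", "Nf3"])

# Hypothetical file location, written to a temp directory for this sketch
encoder_path = os.path.join(tempfile.mkdtemp(), "label_encoder.pkl")

# Persist the fitted encoder right after training
with open(encoder_path, "wb") as f:
    pickle.dump(label_encoder, f)

# Later, e.g. in a separate prediction script, load it back
with open(encoder_path, "rb") as f:
    restored_encoder = pickle.load(f)

# The restored encoder carries the same index-to-move mapping
print(list(restored_encoder.classes_))
```

The restored encoder can then be passed to whatever code decodes the model output.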

Example function:

import numpy as np

def predict_chess_move(model, fen, label_encoder):
    # Convert the FEN to the model input (FEN2ARRAY is your own preprocessing function)
    model_input = FEN2ARRAY(fen)

    # Make the prediction; the result has shape (1, num_classes)
    prediction = model.predict(model_input)

    # Find the index with the highest probability for this single position
    move_index = int(np.argmax(prediction[0]))

    # Convert the index back to a chess move
    predicted_move = label_encoder.inverse_transform([move_index])[0]

    return predicted_move

# Function usage
fen = "8/8/6p1/5k1p/4Rp1P/5P2/p5PK/r7 w - - 8 48"
predicted_move = predict_chess_move(model, fen, label_encoder)
print(f"The predicted move is: {predicted_move}")
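Since the single most likely class is not guaranteed to be a legal move, it can help to look at several candidates and skip the ones that are not allowed. The following is a rough sketch of that idea, not part of your original code: rank_moves is a hypothetical helper, the labels and probabilities are made up, and the "legal" set is a pretend one (in practice it would have to come from a chess library, which is not shown here).

```python
import numpy as np

def rank_moves(probabilities, classes, legal_moves=None, top_k=3):
    """Return up to top_k (move, probability) pairs, most likely first.

    probabilities: 1-D array, one entry per class (e.g. prediction[0])
    classes: labels in class-index order (e.g. label_encoder.classes_)
    legal_moves: optional collection of moves allowed in the position
    """
    order = np.argsort(probabilities)[::-1]  # class indices, highest probability first
    ranked = []
    for i in order:
        if len(ranked) >= top_k:
            break
        move = str(classes[i])
        # Skip candidates that are not in the supplied legal-move set
        if legal_moves is not None and move not in legal_moves:
            continue
        ranked.append((move, float(probabilities[i])))
    return ranked

# Made-up labels and probabilities, for illustration only
classes = np.array(["Kg3", "Re5+", "Re8", "g4+"])
probabilities = np.array([0.15, 0.50, 0.05, 0.30])
pretend_legal = {"Kg3", "Re8", "g4+"}

candidates = rank_moves(probabilities, classes, legal_moves=pretend_legal, top_k=2)
print(candidates)
```

With a real model you would pass prediction[0] and label_encoder.classes_ instead of the made-up arrays.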

I hope you find it useful. When you don't fully understand something, you can always ask ChatGPT to explain it to you, and in different ways, until you find something that works for you ;)