[SOLVED] Why does my Kaggle cloud notebook give a CUDA init error: UNKNOWN ERROR (303)?

I am running everything in a Kaggle cloud notebook, and I am not confident that the environment is configured correctly, because I get a CUDA initialization error.

The answer at "tensorflow (not tensorflow-gpu): failed call to cuInit: UNKNOWN ERROR (303)" did not resolve my issue, so I suspect the problem is Kaggle-related.

Here is the code that loads the relevant packages and functions.

import json
import os
import warnings
from pathlib import Path

import joblib
import polars as pl
import tensorflow as tf
from scipy.spatial.transform import Rotation as R
from sklearn.model_selection import StratifiedGroupKFold, train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.utils.class_weight import compute_class_weight
from tensorflow.keras import backend as K
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.layers import (
    Input, Conv1D, BatchNormalization, Activation, add, MaxPooling1D, Dropout,
    Bidirectional, LSTM, GlobalAveragePooling1D, Dense, Multiply, Reshape,
    Lambda, Concatenate, GRU, GaussianNoise
)
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.regularizers import l2
from tensorflow.keras.utils import Sequence, to_categorical, pad_sequences

warnings.filterwarnings("ignore")

Here is the code that loads the weights into a model.

PRETRAINED_DIR = Path("/kaggle/input/gesture_two_branch_mixup.h5/tensorflow2/default/1")

# custom_objs is defined elsewhere in the notebook (not shown here)
model = load_model(PRETRAINED_DIR / "gesture_two_branch_mixup.h5",
                   compile=False, custom_objects=custom_objs)

Here is the error message.

2025-07-06 23:21:43.497389: E external/local_xla/xla/stream_executor/cuda/cuda_driver.cc:152] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: UNKNOWN ERROR (303)

This message appears when I call load_model from tensorflow.keras.

Relevant information:

tensorflow                         2.18.0

print(tf.sysconfig.get_build_info()['cuda_version'])

Output of the above code (CUDA version):

12.5.1

What should I do in this case?

I called load_model as shown above and got UNKNOWN ERROR (303). I expected the model to load without any error or warning.

Solution

You must configure your Kaggle notebook to use a GPU. The notebook settings have an Enable GPU option that you must turn on; otherwise the session has no GPU attached and you get the CUDA initialization (cuInit) error you see.

Detailed instructions are available at https://www.kaggle.com/code/dansbecker/running-kaggle-kernels-with-a-gpu
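
To confirm that the setting took effect, it can help to check from inside the notebook whether a GPU is actually visible before loading the model. The following is a minimal sketch, not an official Kaggle or TensorFlow procedure: the helper names nvidia_smi_sees_gpu and tensorflow_gpus are made up for illustration, and it assumes nvidia-smi is on the PATH whenever a GPU is attached.

```python
import shutil
import subprocess


def nvidia_smi_sees_gpu() -> bool:
    """Return True if nvidia-smi is installed and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # no NVIDIA tooling on this machine
    try:
        # "nvidia-smi -L" prints one line per GPU, e.g. "GPU 0: ...".
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.SubprocessError):
        return False
    return result.returncode == 0 and "GPU " in result.stdout


def tensorflow_gpus() -> list:
    """Return the names of GPUs TensorFlow can see (empty if none or no TF)."""
    try:
        import tensorflow as tf
    except ImportError:
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    print("nvidia-smi sees a GPU:", nvidia_smi_sees_gpu())
    print("TensorFlow GPUs:", tensorflow_gpus())
```

If both checks come back empty, the session most likely has no GPU attached, which matches the cuInit failure in the question. Note that tf.sysconfig.get_build_info()['cuda_version'] only reports the CUDA version TensorFlow was built against, so it says nothing about whether a GPU is present.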