tensorflow google-colaboratory reproducible-research

TensorFlow-Keras reproducibility problem on Google Colab


I have a simple script that I run on Google Colab (in CPU mode):

import numpy as np
import pandas as pd

## LOAD DATASET

datatrain = pd.read_csv("gdrive/My Drive/iris_train.csv").values
xtrain = datatrain[:,:-1]
ytrain = datatrain[:,-1]

datatest = pd.read_csv("gdrive/My Drive/iris_test.csv").values
xtest = datatest[:,:-1]
ytest = datatest[:,-1]

import tensorflow as tf
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.utils import to_categorical

## SET ALL SEED

import os
os.environ['PYTHONHASHSEED']=str(66)

import random
random.seed(66)

np.random.seed(66)
tf.set_random_seed(66)

from tensorflow.keras import backend as K
session_conf = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)

## MAIN PROGRAM

ycat = to_categorical(ytrain) 

# build model
model = tf.keras.Sequential()
model.add(Dense(10, input_shape=(4,)))
model.add(Activation("sigmoid"))
model.add(Dense(3))
model.add(Activation("softmax"))

#choose optimizer and loss function
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])

# train
model.fit(xtrain, ycat, epochs=15, batch_size=32)

# get predictions
classes = model.predict_classes(xtest)

# compute accuracy (in percent)
accuracy = np.sum(classes == ytest)/len(ytest) * 100

I followed the setup for reproducible code described in "Reproducible results using Keras with TensorFlow backend" and put all of the code in the same cell. But the result (e.g. the loss) is different every time I run that cell (using Shift + Enter).

In my case, the result of the code above is reproducible only if:

  1. I run it using "Runtime" > "Restart and run all", or
  2. I put the code in a single file and run it from the command line (python3 file.py).

Is there something I am missing to make the result reproducible without restarting the runtime?


Solution

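  • You should also fix the seed of the kernel_initializer in your Dense layers. Re-running the cell keeps adding ops to the same default graph, so the seeds TensorFlow derives for unseeded initializers most likely change from one run to the next, even though tf.set_random_seed(66) is called again. With an explicit seed, your model becomes:

    model = tf.keras.Sequential()
    model.add(Dense(10, kernel_initializer=tf.keras.initializers.glorot_uniform(seed=66), input_shape=(4,)))
    model.add(Activation("sigmoid"))
    model.add(Dense(3, kernel_initializer=tf.keras.initializers.glorot_uniform(seed=66)))
    model.add(Activation("softmax"))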
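
To illustrate why an explicit initializer seed can help when a cell is re-run, here is a toy model in plain Python. It is not TensorFlow code and not TensorFlow's actual implementation. It only assumes that an unseeded random op gets its seed from the graph-level seed plus a counter of ops created so far in the graph, which is my understanding of TF 1.x graph mode. The names ToyGraph, glorot_uniform and run_cell are hypothetical.

```python
import math
import random


class ToyGraph:
    """Hypothetical stand-in for a TF 1.x default graph (illustration only)."""

    def __init__(self):
        self.seed = None    # graph-level seed, like tf.set_random_seed
        self.op_count = 0   # grows every time an op is added to the graph

    def random_op(self, op_seed=None):
        # Assumption: without an op-level seed, the op's seed is taken
        # from the number of ops created so far in this graph.
        self.op_count += 1
        if op_seed is None:
            op_seed = self.op_count
        return random.Random(self.seed * 1_000_003 + op_seed)


def glorot_uniform(graph, fan_in, fan_out, seed=None):
    """Draw a fan_in x fan_out matrix from U(-limit, limit)."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    rng = graph.random_op(seed)
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


def run_cell(graph, layer_seed=None):
    """Mimic one execution of the notebook cell on a given graph."""
    graph.seed = 66  # the cell re-sets the graph-level seed on every run
    hidden = glorot_uniform(graph, 4, 10, seed=layer_seed)
    output = glorot_uniform(graph, 10, 3, seed=layer_seed)
    return hidden, output


# Same graph, cell run twice, no per-layer seed: the weights differ,
# because the op counter kept growing between the two runs.
graph = ToyGraph()
unseeded_first = run_cell(graph)
unseeded_second = run_cell(graph)
print(unseeded_first == unseeded_second)   # False

# Same graph, cell run twice, per-layer seed: the weights match.
seeded_first = run_cell(graph, layer_seed=66)
seeded_second = run_cell(graph, layer_seed=66)
print(seeded_first == seeded_second)       # True

# A fresh graph (like "restart and run all") reproduces the first run.
print(run_cell(ToyGraph()) == unseeded_first)  # True
```

Under this assumption, restarting the runtime works because the op counter starts from zero again, and the per-layer seed works because the counter no longer matters for the initial weights.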