I am trying to implement an autoencoder on the MNIST dataset: I first encode the images and then decode them. With Keras 2.3.1 the decoded images are very close to the originals, but with Keras 2.4.3 and no change to the code the output is completely different and the decoded images are close to garbage. I could not find a reason for this, nor any documentation or article on how to migrate from 2.3.1 to 2.4.3.
This is the output with Keras 2.3.1.
You can find the code in Google Colab or below; please note that Google Colab uses Keras 2.3.1.
import keras
from keras.layers import Input, Dense
from keras.models import Model
import numpy as np
input_img = Input(shape=(784,))  # input layer
encoded = Dense(32, activation='relu')(input_img)  # encoder
decoded = Dense(784, activation='sigmoid')(encoded)  # decoder, output
# defining the autoencoder model
autoencoder = Model(input_img, decoded)  # autoencoder = encoder + decoder
# defining the encoder model
encoder = Model(input_img, encoded)  # maps input images to their encoded representation
# defining the decoder model
encoded_input = Input(shape=(32,))
decoded_layer = autoencoder.layers[-1](encoded_input)  # reuse the trained output layer
decoder = Model(encoded_input, decoded_layer)
autoencoder.compile(optimizer = 'adadelta', loss='binary_crossentropy')
# Test on images
from keras.datasets import mnist
(x_train, _), (x_test, _) = mnist.load_data()
# Normalize the values between 0 and 1 and flatten the 28x28 images into vectors of length 784
x_train = x_train.astype('float32')/255
x_test = x_test.astype('float32')/255
# reshaping (60000, 28,28) -> (60000, 784)
x_train = x_train.reshape(len(x_train), np.prod(x_train.shape[1:]))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
autoencoder.fit(x_train, x_train, epochs=50, batch_size=200)
encoded_img = encoder.predict(x_test)
decoded_img = decoder.predict(encoded_img)
encoded_img.shape, decoded_img.shape
# Performing visualization
import matplotlib.pyplot as plt
n = 10
plt.figure(figsize=(40, 8))
for i in range(n):
    # original image
    plt.subplot(2, n, i+1)
    plt.imshow(x_test[i].reshape(28, 28))
    # reconstructed image
    plt.subplot(2, n, n+i+1)
    plt.imshow(decoded_img[i].reshape(28, 28))
plt.show()
Any suggestions?
It looks like the default learning rate of the Adadelta optimizer is 1.0 in standalone Keras but 0.001 in tf.keras, and Keras 2.4 simply redirects to tf.keras. After the switch, the Adadelta learning rate is so small that the network learns almost nothing in 50 epochs. You can set the learning rate explicitly before compiling your model, and you will get the same behavior in tf.keras as in standalone Keras:
import tensorflow as tf
opt = tf.keras.optimizers.Adadelta(learning_rate=1.0)
autoencoder.compile(optimizer=opt, loss='binary_crossentropy')
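If you want to verify the difference yourself, here is a quick check, a sketch assuming a TF 2.x environment (where the keras package redirects to tf.keras); the commented-out lines show the equivalent check under standalone Keras 2.3.1:
# default learning rate of Adadelta in tf.keras
import tensorflow as tf
print(float(tf.keras.optimizers.Adadelta().learning_rate))  # -> 0.001
# under standalone Keras 2.3.1 the same check prints 1.0:
# import keras
# print(keras.backend.get_value(keras.optimizers.Adadelta().lr))  # -> 1.0
Alternatively, Adam has the same default learning rate (0.001) in both implementations and should also train this autoencoder well, so switching to optimizer='adam' is another way to sidestep the discrepancy.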