keras, theano, conv-neural-network, keras-layer

Memory error in Keras while implementing a CNN


I have created the model below (Keras with the Theano backend). When I run it on my CPU it gives me a memory error. I have 8 GB of DDR3 RAM, and before calling model1.fit about 2.3 GB is already in use. I can watch the RAM usage climb to 7.5 GB before the program crashes. I also tried running it on a GPU (Nvidia GeForce GTX 860M, 4 GB), but I still got a memory error.

import numpy as np
import keras
from keras.layers import Conv2D
from keras.optimizers import SGD

def get_model_convolutional():
    model = keras.models.Sequential()
    model.add(Conv2D(128, (3, 3), strides=(1, 1), activation='relu', input_shape=(1028, 1028, 3)))
    model.add(Conv2D(3, (3, 3), strides=(1, 1), activation=None))
    sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(loss='categorical_crossentropy', optimizer=sgd)
    return model

if __name__ == "__main__":
    model1 = get_model_convolutional()
    # dummy data whose shapes match the model's input and output
    train_x = np.ones((108, 1028, 1028, 3), dtype=np.uint8)
    train_y = np.ones((108, 1024, 1024, 3), dtype=np.uint8)
    model1.fit(train_x, train_y, verbose=2, epochs=20, batch_size=4)

Also, the output of model.summary() is:

Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 1026, 1026, 128)   3584      
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 1024, 1024, 3)     3459      
=================================================================
Total params: 7,043
Trainable params: 7,043
Non-trainable params: 0
_________________________________________________________________

Why is so much memory required? I tried to calculate it, and I think around 1.5 GB should be enough. This is my first model.


Solution

  • The memory is needed to store the intermediate layer outputs (activations), which are very large here because there is no pooling to downsample them: the first Conv2D alone produces a (1026, 1026, 128) float32 tensor per sample, roughly 0.5 GB. Possible remedies are to reduce the number of filters, reduce the image size (e.g. train on crops and stitch the predictions together later), or reduce the batch size; see the rough estimate below.
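
To make the numbers concrete, here is a minimal sketch (assuming float32 activations and batch size 4, as in the question) of how much memory the forward activations alone consume. The layer shapes are taken from the model.summary() output above.

import numpy as np

batch_size = 4
bytes_per_float = 4  # Theano/Keras compute in float32 by default

# per-sample activation sizes, from model.summary() above
conv1_out = 1026 * 1026 * 128   # conv2d_1 output
conv2_out = 1024 * 1024 * 3     # conv2d_2 output

per_sample = (conv1_out + conv2_out) * bytes_per_float
per_batch = per_sample * batch_size

print("conv2d_1 activations per sample: %.2f GB" % (conv1_out * bytes_per_float / 1e9))
print("forward activations per batch:   %.2f GB" % (per_batch / 1e9))
# Backpropagation needs gradients of the same shapes, roughly doubling this,
# and Keras typically casts the uint8 training arrays to float32 as well,
# which adds several more GB on top of the ~2.3 GB already in use.

This comes out to about 0.54 GB of activations per sample and about 2.2 GB per batch of 4 for the forward pass alone, so 8 GB of RAM is easily exhausted once gradients and the float32 copies of the data are included. Cutting the first layer from 128 to 32 filters and the batch size from 4 to 1 would shrink the activation footprint by roughly a factor of 16, which is the kind of reduction the suggestions above aim for.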