It gives a memory error, but memory capacity is never reached. I have 60 GB of RAM on the SSH machine, and processing the full dataset consumes about 30 GB. I am trying to train an autoencoder with k-fold cross-validation. Without k-fold the training works fine. The raw dataset contains 250,000 samples in HDF5. With k-fold it only works if I use fewer than 100,000 samples in total. I have converted the data to float32, but it still does not work. I have also tried echo 1, but that kills the Python program automatically.
Taking into account the dimensions of the dataset you provided (725000 x 277 x 76) and its data type (float64, i.e. 8 bytes per element), it seems that you need at minimum around 114 GB to have the dataset loaded/stored in RAM (725,000 × 277 × 76 × 8 bytes ≈ 122 × 10^9 bytes ≈ 114 GiB).
A solution to overcome this limitation is to: 1) read a portion of the dataset (e.g. a chunk of roughly 1 GB at a time) through a hyperslab selection and load it into memory, 2) process it, and 3) repeat (i.e. go back to step 1) until the whole dataset has been processed. This way, you will not run out of RAM. A sketch of this pattern is shown below.
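For illustration, here is a minimal sketch of that loop using h5py, where slicing the on-disk dataset performs the hyperslab selection so only one chunk is resident in memory at a time. The file name, dataset key, chunk size, and process_chunk are all placeholders for your actual setup:

```python
import h5py
import numpy as np

CHUNK_ROWS = 10_000  # placeholder: tune so one chunk fits comfortably in RAM

def process_chunk(chunk):
    # placeholder: cast and feed this chunk to your training / k-fold logic
    return chunk.astype(np.float32)

with h5py.File("dataset.h5", "r") as f:   # hypothetical file name
    dset = f["data"]                       # hypothetical dataset key
    n_rows = dset.shape[0]
    for start in range(0, n_rows, CHUNK_ROWS):
        stop = min(start + CHUNK_ROWS, n_rows)
        # slicing the h5py Dataset performs a hyperslab selection:
        # only rows [start, stop) are read from disk into memory
        chunk = dset[start:stop]
        process_chunk(chunk)
```

Only one chunk of the dataset lives in RAM at any point, so the peak memory usage is bounded by the chunk size rather than by the full dataset size.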