google-cloud-platform, gcp-ai-platform-notebook

Memory error in Google Cloud Platform AI Jupyter notebook


I am trying to run a sentiment analysis on Google Cloud Platform (AI Platform). When I try to split the data into training and test sets, it shows the memory error below:

MemoryError: Unable to allocate 194. GiB for an array with shape (414298,) and data type <U125872

How do I increase the available memory? Should I change the machine type of the instance? If so, which setting would be appropriate?


Solution

  • From the error, it seems the VM is out of memory: NumPy cannot allocate the 194 GiB needed for 414,298 fixed-width Unicode strings of dtype <U125872.
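
    That figure checks out: NumPy stores fixed-width Unicode as 4 bytes per character, so the shape and dtype from the error message alone account for the 194 GiB. A quick check (values taken verbatim from the error):

    import numpy as np

    # Dtype from the error: 125,872 characters per element, 4 bytes each
    itemsize = np.dtype('<U125872').itemsize   # 503,488 bytes per element

    print(414298 * itemsize / 2**30)           # ~194.3 GiB, matching the MemoryError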

    1 - Create a new Notebook with another machine type. To do this, go to AI Platform > Notebooks and click NEW INSTANCE. Select the environment that best fits you (R 3.6, Python 2 and 3, etc.) and click ADVANCED OPTIONS in the pane that opens. In the Machine configuration area you can pick a machine type with more memory.

    Start with n1-standard-16 (60 GB of RAM) or n1-highmem-8 (52 GB), and if neither of those is enough, move up to n1-standard-32 (120 GB) or n1-highmem-16 (104 GB).

    You can also change the machine type from the command line:

    gcloud compute instances set-machine-type INSTANCE_NAME \
        --machine-type NEW_MACHINE_TYPE
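
    Note that set-machine-type only works while the instance is stopped, so a full resize looks like the following sketch (the zone and target machine type are placeholders; substitute your own):

    gcloud compute instances stop INSTANCE_NAME --zone us-central1-a
    gcloud compute instances set-machine-type INSTANCE_NAME \
        --zone us-central1-a \
        --machine-type n1-highmem-16
    gcloud compute instances start INSTANCE_NAME --zone us-central1-a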
    

    2 - Change the dtype. If you are working with np.float64 data, you can switch to np.float32 to halve the memory footprint. For example, modify the line:

    result = np.empty(self.shape, dtype=dtype)

    to:

    result = np.empty(self.shape, dtype=np.float32)
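
    As a rough illustration of the saving (the shape below is arbitrary, chosen only to mirror the row count from the error):

    import numpy as np

    shape = (414298, 100)   # hypothetical numeric feature matrix

    a64 = np.empty(shape, dtype=np.float64)
    a32 = np.empty(shape, dtype=np.float32)

    print(a64.nbytes / 2**20)   # ~316 MiB
    print(a32.nbytes / 2**20)   # ~158 MiB, half the footprint

    Keep in mind that the array in your error has a string dtype (<U125872), so this trick only helps for numeric columns.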

    If you don't want to modify your code, I suggest you follow the first option.