I am using Google Colab on a dataset with 4 million rows and 29 columns. When I run the statement sns.heatmap(dataset.isnull()), it runs for a while, then the session crashes and the instance restarts. This has happened many times, and so far I have never seen any output. What could be the reason? Is the data or the calculation too much? What can I do?
I'm not sure what is causing your specific crash, but a common cause is an out-of-memory error. It sounds like you're working with a large enough dataset that this is probable. You might try working with a subset of the dataset and see if the error recurs.
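As a rough sketch of the subset idea, assuming pandas and NumPy are available: the helper name `null_mask_sample`, the sample size, and the stand-in frame `demo` are all made up for illustration, and in your case `demo` would be `dataset`. It computes the null mask on a random sample of rows only, and also prints the per-column missing fraction, which is cheap even on the full frame.

```python
import numpy as np
import pandas as pd


def null_mask_sample(df, n_rows=5000, seed=0):
    """Return the isnull() mask for a random sample of at most n_rows rows.

    Hypothetical helper for illustration, not part of pandas or seaborn.
    """
    n = min(n_rows, len(df))
    # sort_index keeps the sampled rows in their original order
    return df.sample(n=n, random_state=seed).sort_index().isnull()


# Small stand-in frame; in the question this would be `dataset`.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(100_000, 4)), columns=list("abcd"))
demo.loc[::10, "b"] = np.nan  # make every 10th value in column b missing

mask = null_mask_sample(demo, n_rows=2000)
print(mask.shape)

# Fraction of missing values per column, computed on the full frame.
print(demo.isnull().mean())

# The small mask can then be plotted instead of the full one, e.g.:
#   import seaborn as sns
#   sns.heatmap(mask, cbar=False)
```

If the sampled plot renders without a crash, that would point toward memory as the culprit for the full 4-million-row call, though it does not prove it.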
Otherwise, Colab keeps logs in /var/log/colab-jupyter.log. You may be able to get more insight into what is going on by printing its contents. Either run:
!cat /var/log/colab-jupyter.log
Or, to get the messages alone (easier to read):
import json
with open("/var/log/colab-jupyter.log", "r") as fo:
    for line in fo:
        print(json.loads(line)['msg'])