We are running a scheduling engine with Docker, Chronos and Mesos.
We run 2 Mesos slaves on each node. Sometimes too many jobs get executed on a node, Docker becomes unresponsive, and after rebooting the server the Docker installation is corrupted. Is there anything wrong with this setup? I am not sure why Docker hangs and why it gets corrupted on reboot.
Thanks
Check out the --cgroups_root flag in https://github.com/apache/mesos/blob/master/docs/configuration/agent.md. Note that this flag only applies to the MesosContainerizer (which can also be used to launch Docker containers).
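As a rough illustration of how that flag might be used when two agents share one host, here is a minimal Python sketch that only builds and prints the two command lines; it does not start anything. The helper name agent_command, the master address, the ports, the work directories, the isolator list and the cgroup root names are all placeholder assumptions, not values from your setup. The idea is simply that each agent gets its own --work_dir, --port and --cgroups_root so the two do not share state.

```python
import shlex


def agent_command(index, master="zk://localhost:2181/mesos"):
    """Build a mesos-agent command line for the index-th agent on one host.

    All concrete values here are hypothetical placeholders.
    """
    name = f"agent{index}"
    return [
        "mesos-agent",
        f"--master={master}",
        # Each agent on the same host needs its own port and work directory.
        f"--port={5051 + index}",
        f"--work_dir=/var/lib/mesos/{name}",
        # --cgroups_root only takes effect with the Mesos containerizer.
        "--containerizers=mesos",
        "--image_providers=docker",
        "--isolation=cgroups/cpu,cgroups/mem,filesystem/linux,docker/runtime",
        # Separate cgroup root per agent (assumed naming scheme).
        f"--cgroups_root=mesos_{name}",
    ]


commands = [agent_command(i) for i in range(2)]
for cmd in commands:
    print(shlex.join(cmd))
```

Whether this helps with the Docker hang is something you would have to verify on your nodes; the sketch only shows where the flag goes.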