mesos mesos-chronos

Apache Chronos Architecture Explanation


I was trying to see what makes Chronos better than cron, but I am not able to fully understand its job scheduling and execution architecture.

Specifically, these are the questions about the Chronos architecture that are not clear to me:

  1. In one piece of Chronos documentation I read that cron is bad because it is a single point of failure (SPoF), and that Chronos is better. How does Chronos avoid a SPoF?
  2. Where are job schedules saved in Chronos? Does it maintain some sort of DB for that?
  3. How are scheduled jobs triggered? Who sends an event to Chronos to trigger the job?
  4. Are dependent jobs triggered by Chronos? If yes, how does Chronos know when the parent job has completed? Can it distinguish failed jobs from completed ones?
  5. I saw that jobs in Chronos are defined in JSON. Is there a reason for using JSON rather than another format such as YAML or Apache config?
  6. Can a job in Chronos have multiple commands? If yes, will all of these commands be executed on the same machine in the cluster, or can Chronos launch different commands of one job on different machines? Can the commands inside a job be launched in parallel?
  7. If Mesos already has a scheduling capability, why is Chronos even required? Can Chronos run without Mesos?
  8. Does Chronos support event-based scheduling? For example, running my job when file 'x' is created.
  9. What does an async run of a job mean in Chronos?

Does anyone have a good reference for understanding the architecture of Chronos?


Solution

  • Some of your questions are answered in my reply here, so I will focus on the questions not addressed there.

    1. Chronos stores state in memory unless you are using ZooKeeper, in which case it is stored in ZooKeeper at /chronos/state by default (reference here).

    2. See: Chronos: How does it work?

    3. Chronos tracks this based on the lastsuccess and lastfailure values seen here.

    4. Because the authors decided to use JSON and a RESTful API.

    5. Yes, by chaining commands with && or by wrapping them in a bash script. They will all be executed on the same machine that the job is running on. No, a single job cannot run its commands in parallel, but multiple jobs can be scheduled at the same time.

    6. Because Chronos is for short-lived cron jobs that are scheduled on a regular basis, whereas Marathon is for long-lived tasks. The reason Chronos is a good replacement for cron is that it is wholly dependent on Mesos, which means you can also use Mesos attributes to place jobs appropriately around your Mesos cluster. See here and here.

    7. Nope.

    8. The state of async jobs is suspect: the feature looks like it was removed, but unfortunately the documentation still has some references to it.
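To make a few of the points above more concrete, here is a minimal Python sketch of what the JSON job definitions might look like and how a client could tell whether a job's most recent run succeeded. Everything in it is an assumption for illustration: the helper names (`scheduled_job`, `dependent_job`, `last_run_succeeded`) are hypothetical, and the field names (`command`, `schedule`, `parents`, `constraints`, `lastSuccess`, `lastError`) and the timestamp format are what I recall from the Chronos job JSON, so check them against the API documentation for your version. It only builds and inspects dictionaries; it does not talk to a Chronos server.

```python
import json
from datetime import datetime


def scheduled_job(name, commands, schedule, owner, constraints=None):
    """Build a time-scheduled job whose commands run in order on one machine."""
    job = {
        "name": name,
        # '&&' chains the commands in a single shell invocation, so each one
        # runs only if the previous one exited successfully.
        "command": " && ".join(commands),
        # ISO 8601 repeating interval: repeat count / start time / period.
        "schedule": schedule,
        "owner": owner,
    }
    if constraints:
        # Assumed shape: [attribute, operator, value] triples that are
        # matched against Mesos agent attributes.
        job["constraints"] = constraints
    return job


def dependent_job(name, command, parents, owner):
    """Build a job that has parent jobs instead of a schedule."""
    return {
        "name": name,
        "command": command,
        "parents": list(parents),
        "owner": owner,
    }


def _parse(timestamp):
    """Parse a timestamp such as 2024-01-02T00:00:05.000Z; empty means never."""
    if not timestamp:
        return None
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")


def last_run_succeeded(job_state):
    """Return True or False for the most recent run, or None if it never ran."""
    ok = _parse(job_state.get("lastSuccess", ""))
    err = _parse(job_state.get("lastError", ""))
    if ok is None and err is None:
        return None
    if err is None:
        return True
    if ok is None:
        return False
    return ok > err


if __name__ == "__main__":
    parent = scheduled_job(
        "nightly-export",
        ["cd /opt/export", "./dump.sh", "./upload.sh"],
        "R/2024-01-01T02:00:00Z/P1D",
        "ops@example.com",
        constraints=[["rack", "EQUALS", "rack-1"]],
    )
    child = dependent_job(
        "nightly-report", "./report.sh", ["nightly-export"], "ops@example.com"
    )
    print(json.dumps([parent, child], indent=2))
```

Two jobs built this way could be started at the same time to get parallelism across jobs, while the commands inside each job still run one after another on a single machine.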