I have multiple unrelated jobs running on Spark/Hadoop grid executors, all started by a single spark-submit from the driver node. When I need to stop the jobs, I'd like to save their IDs before the driver program quits (or, more generally, do something like save state and/or clean up resources).
Is there a way to handle a YARN termination event and perform such operations, by analogy with POSIX signals, where you handle SIGTERM (and friends) to do some last-chance cleanup?
I haven't found a way to handle yarn kill. Is there one? Or is there an alternative to yarn kill that would satisfy this need?
Use a Spark listener and override onApplicationEnd to run cleanup when the application shuts down (including after a yarn application -kill):

import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationEnd}

sparkContext.addSparkListener(new SparkListener {
  override def onApplicationEnd(applicationEnd: SparkListenerApplicationEnd): Unit = {
    // save job IDs / clean up resources here
  }
})
Apache Ignite (at least in the version I use) does exactly this for the same purpose.
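As a complementary approach (my own suggestion, not something the listener API requires): YARN delivers SIGTERM to containers and gives them a short grace period before SIGKILL, so a plain JVM shutdown hook on the driver also gets a chance to run on yarn kill. A minimal sketch, where `runningJobIds` is a hypothetical collection your application maintains:

```scala
import java.util.concurrent.ConcurrentLinkedQueue
import scala.jdk.CollectionConverters._

// Hypothetical: your application adds/removes job IDs here as jobs start and finish.
val runningJobIds = new ConcurrentLinkedQueue[String]()

// Registered once at driver startup; runs on normal exit and on SIGTERM,
// but not on SIGKILL, so keep it fast enough to finish within YARN's grace period.
sys.addShutdownHook {
  val ids = runningJobIds.asScala.mkString(",")
  // Persist somewhere durable (HDFS, a database, ...); println is only a placeholder.
  println(s"Driver shutting down; running job IDs: $ids")
}
```

The listener and the hook are not mutually exclusive: onApplicationEnd fires when the SparkContext stops cleanly, while the shutdown hook also covers terminations where the context is torn down by the JVM exiting.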