I want to add a safety measure to my Spark jobs: if they don't finish after X hours, they should kill themselves. (I'm using Spark 2.4.3 in cluster mode on YARN.)
I didn't find any Spark configuration that does what I want.
I tried to do it this way:
val timeoutProcess = new java.util.Timer("job-timeout-timer", true) // daemon Timer used to schedule the kill task
val task = new java.util.TimerTask {
  def run(): Unit = {
    // this only works where the yarn CLI is available, i.e. when the driver runs on the cluster
    val p = Runtime.getRuntime.exec(Array[String]("/bin/sh", "-c", s"yarn application -kill ${sc.applicationId}"))
    p.waitFor()
    sc.stop()
  }
}
timeoutProcess.schedule(task, X) // X is the delay in milliseconds, e.g. 10000 for 10s while testing
But it doesn't seem to kill the application. I'd appreciate any ideas or thoughts on this.
I've looked around but didn't find a good approach.
The proper way to set a timeout for a job is via YARN.
Check this Hadoop JIRA.
You can either use the CLI:
yarn application -appId <your app id> -updateLifetime 3600
This will kill your application 3600 seconds after you run the command.
Or you can use the ResourceManager REST endpoint to update this as well.
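As a rough sketch, assuming a Hadoop 2.9+/3.x ResourceManager with application lifetimes enabled, an RM web address like http://<rm-host>:8088, and simple (non-Kerberos) authentication, the update is a PUT to the application's timeout resource. The placeholders <rm-host>, <app-id>, and <your-user> are yours to fill in, and the path and JSON shape should be verified against your Hadoop version's RM REST API docs:

# Rough sketch: set an absolute LIFETIME expiry for a running application via the RM REST API.
# Unlike -updateLifetime, expiryTime is an ISO-8601 timestamp, not a number of seconds.
curl -X PUT -H "Content-Type: application/json" \
  -d '{"timeout": {"type": "LIFETIME", "expiryTime": "2024-01-01T00:00:00.000+0000"}}' \
  "http://<rm-host>:8088/ws/v1/cluster/apps/<app-id>/timeout?user.name=<your-user>"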