
Triggering Spark jobs with REST


I have been trying out Apache Spark lately. My question is specifically about triggering Spark jobs. I previously posted a question about understanding Spark jobs. After getting my hands dirty with jobs, I moved on to my requirement.

I have a REST endpoint that exposes an API to trigger jobs; I used Spring 4.0 for the REST implementation. Going forward, I thought of implementing jobs as a service in Spring and submitting them programmatically: when the endpoint is called with given parameters, it would trigger the job. I now have a few design options.

First, I would like to know which solution is best in this case, in terms of both execution and scaling.

Note: I am using a Spark standalone cluster. Kindly help.


Solution

  • Just use the Spark JobServer: https://github.com/spark-jobserver/spark-jobserver

    There is a lot to consider when building a service like this, and the Spark JobServer already covers most of it. If you find parts that aren't good enough, it should be easier to open a request and contribute code to their project than to reinvent it from scratch.
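To make the suggestion concrete, here is a minimal sketch of how a Spring service could hand a job to the JobServer over HTTP. It assumes a JobServer listening at http://localhost:8090 and a job jar already uploaded under an app name; the `POST /jobs?appName=...&classPath=...` shape follows the curl examples in the project's README as I understand them, so check it against the version you deploy. `JobServerClient`, `my-app`, and `com.example.WordCountJob` are hypothetical names used only for illustration, and the code needs Java 11+ for `java.net.http`.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; it is not part of Spark JobServer itself. */
final class JobServerClient {
    // Assumed JobServer address, e.g. "http://localhost:8090".
    private final String baseUrl;
    private final HttpClient http = HttpClient.newHttpClient();

    JobServerClient(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    /**
     * Builds the POST /jobs request that asks the JobServer to start a job.
     * appName is the name the jar was uploaded under, classPath is the job's
     * main class, and jobConfig is the job's configuration text sent as the body.
     */
    HttpRequest buildSubmitRequest(String appName, String classPath, String jobConfig) {
        String query = "appName=" + encode(appName) + "&classPath=" + encode(classPath);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/jobs?" + query))
                .POST(HttpRequest.BodyPublishers.ofString(jobConfig))
                .build();
    }

    /** Sends the request and returns the raw response body from the JobServer. */
    String submit(String appName, String classPath, String jobConfig)
            throws IOException, InterruptedException {
        HttpResponse<String> response = http.send(
                buildSubmitRequest(appName, classPath, jobConfig),
                HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

A REST controller like the one described in the question could then call something like `new JobServerClient("http://localhost:8090").submit("my-app", "com.example.WordCountJob", "input.string = a b c")` and pass the response back to the caller. Keeping request construction separate from sending makes the URL logic easy to unit test without a running cluster.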