hadoop, spring-boot, hadoop-yarn, spring-data-hadoop

How to pass parameters from web requests to a Spring Boot YARN application


I'm using spring-boot and spring-boot-yarn to submit yarn applications to a cluster.

My use-case is close to the one described in this tutorial https://github.com/spring-guides/gs-yarn-basic.

The only difference is that my 'client' is supposed to be a web application and submit the yarn jobs when web requests are made.

The problem I have is that web requests to the 'client' web-application provide parameters I need to pass down to the yarn job.

In the above tutorial, parameters are passed as command-line arguments to the appmaster / container specified in application.yml. In my case this approach does not work, since I have a different set of parameters for each yarn job.

Is there a way to pass dynamic parameters to yarn jobs without hard-coding them in application.yml?


Solution

  • The original idea was to prevent "rogue" users or applications from passing properties that would then automatically end up as command-line options, potentially causing harm within a Hadoop cluster.

    It's worth checking my answer in Spring Boot Yarn - Passing Command line arguments if this is what you want.

    Having said that, you are not the first person to ask this or to "complain" that it is too difficult or unclear how to do it. We're going to make this much easier in future releases, mostly because it just seems to be what users want to do.
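
To illustrate the concern about "rogue" properties ending up as command-line options, here is a minimal sketch of how a web client might filter request parameters before turning them into arguments for a yarn job. It uses only the Java standard library and is not part of the Spring YARN API; the class name `YarnArgumentBuilder`, the whitelisted keys, and the value pattern are all hypothetical and would need to be adapted. How the resulting arguments are handed to the appmaster or container is left out, since that depends on the launch configuration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Spring YARN: converts web request
// parameters into "--key=value" arguments, passing only whitelisted keys.
final class YarnArgumentBuilder {

    // Assumption: the only parameters a yarn job is allowed to receive.
    private static final Set<String> ALLOWED_KEYS =
            Set.of("inputPath", "outputPath", "jobName");

    // Assumption: a conservative value pattern that rejects whitespace and
    // shell metacharacters, so a value cannot inject extra options.
    private static final Pattern SAFE_VALUE =
            Pattern.compile("[A-Za-z0-9_./:-]+");

    private YarnArgumentBuilder() {
    }

    // Unknown keys are silently dropped; an unsafe value for a known key
    // is rejected with an IllegalArgumentException.
    static List<String> toArguments(Map<String, String> requestParams) {
        List<String> arguments = new ArrayList<>();
        for (Map.Entry<String, String> entry : requestParams.entrySet()) {
            String key = entry.getKey();
            String value = entry.getValue();
            if (!ALLOWED_KEYS.contains(key)) {
                continue;
            }
            if (value == null || !SAFE_VALUE.matcher(value).matches()) {
                throw new IllegalArgumentException(
                        "Unsafe value for parameter: " + key);
            }
            arguments.add("--" + key + "=" + value);
        }
        return arguments;
    }
}
```

A web controller could call `YarnArgumentBuilder.toArguments(...)` with the parameters of each request and pass the result on to whatever mechanism submits the job, so every submission gets its own arguments while anything outside the whitelist never reaches the cluster.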