hadoop, spring-data-hadoop

Spring Yarn @OnContainerStart - how to invoke Mapper?


I'm using the Spring YARN package with Spring Boot, and I'm trying to figure out how to start a Mapper from the @OnContainerStart event. How do I pass arguments to the mapper? How do I configure which mapper/reducer to use? I'm trying to follow this guide.

Thanks


Solution

  • I believe you're trying to create a simple Apache Hadoop MapReduce application, and Spring YARN is not meant for that.

    To develop MapReduce jobs using Spring, see our reference documentation, which can be found under Spring for Apache Hadoop.

    Spring YARN is a framework for developing applications that run on top of Apache Hadoop YARN, not on top of MapReduce. This is easy to misunderstand because "Hadoop YARN" is too often used as a synonym for Apache Hadoop MapReduce V2. The new MapReduce V2 is actually just a YARN application running on YARN, which is Hadoop's new resource scheduling framework.

    That said, if you want to run something other than MapReduce jobs on YARN, then Spring YARN would be the right choice.
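
    As a rough sketch of the Spring for Apache Hadoop route (not taken from the guide mentioned in the question), the mapper and reducer can be named declaratively on an `hdp:job` element in the Spring XML configuration. Values set in `hdp:configuration` end up in the Hadoop `Configuration` that the job uses. The class names, paths, file system address, and the `example.min.length` property below are hypothetical placeholders, so adjust them to your own project.

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <beans xmlns="http://www.springframework.org/schema/beans"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:hdp="http://www.springframework.org/schema/hadoop"
           xsi:schemaLocation="
             http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
             http://www.springframework.org/schema/hadoop http://www.springframework.org/schema/hadoop/spring-hadoop.xsd">

      <!-- Hadoop configuration shared by the job below.
           fs.defaultFS points at an assumed local HDFS; example.min.length
           is a made-up property used to pass an argument to the mapper. -->
      <hdp:configuration>
        fs.defaultFS=hdfs://localhost:8020
        example.min.length=3
      </hdp:configuration>

      <!-- Selects which mapper and reducer classes the job uses
           (com.example.* are hypothetical classes). -->
      <hdp:job id="wordCountJob"
               input-path="/example/input"
               output-path="/example/output"
               mapper="com.example.WordMapper"
               reducer="com.example.WordReducer"/>

      <!-- Submits the job when the application context starts. -->
      <hdp:job-runner id="wordCountRunner" job-ref="wordCountJob" run-at-startup="true"/>
    </beans>
    ```

    Inside the mapper, the argument would then be read from the job configuration, for example with `context.getConfiguration().get("example.min.length")` in the mapper's `setup` method.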