hadoop hdfs hadoop-yarn namenode

When do YARN and NameNode interact?


When a job is submitted, when do YARN and the NameNode interact? Who does the job get sent to? Could someone explain the end-to-end flow, i.e. how the Hadoop ecosystem works?

Thanks!


Solution

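  • NameNode: Stores the metadata (file names, block IDs, block locations) for all the data held on the DataNodes and monitors the health of the DataNodes through their heartbeats. HDFS is a master-slave architecture: the NameNode is the master and the DataNodes are the slaves.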
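
    As a rough illustration of the two NameNode roles above, here is a minimal in-memory sketch. `ToyNameNode`, its fields, and the 30-second timeout are hypothetical names and values chosen for the example; none of this is a Hadoop API, and a real NameNode persists and replicates this state.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Hypothetical stand-in for NameNode metadata (not a Hadoop API)."""
    # file path -> ordered list of block ids
    file_blocks: dict = field(default_factory=dict)
    # block id -> DataNode hosts that hold a replica
    block_locations: dict = field(default_factory=dict)
    # DataNode host -> time of its last heartbeat
    last_heartbeat: dict = field(default_factory=dict)

    def heartbeat(self, datanode, now):
        """Record that a DataNode reported in at time `now`."""
        self.last_heartbeat[datanode] = now

    def live_datanodes(self, now, timeout=30.0):
        """DataNodes heard from within `timeout` seconds (toy value)."""
        return {dn for dn, t in self.last_heartbeat.items() if now - t <= timeout}

    def get_block_locations(self, path, now):
        """Return (block id, live replica hosts) for each block of a file."""
        live = self.live_datanodes(now)
        return [
            (block, [dn for dn in self.block_locations[block] if dn in live])
            for block in self.file_blocks[path]
        ]


nn = ToyNameNode()
nn.file_blocks["/data/input.txt"] = ["blk_1", "blk_2"]
nn.block_locations = {"blk_1": ["dn1", "dn2"], "blk_2": ["dn2", "dn3"]}
for dn in ("dn1", "dn2", "dn3"):
    nn.heartbeat(dn, now=0.0)
nn.heartbeat("dn2", now=40.0)  # only dn2 keeps reporting
print(nn.get_block_locations("/data/input.txt", now=50.0))
```

    Note that the NameNode only answers "where are the blocks"; the data itself is read from the DataNodes.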

    YARN: Stands for Yet Another Resource Negotiator. It is also master-slave: the Resource Manager is the master, and a Node Manager runs on each slave node. The Resource Manager has two main components:

    1.> Scheduler

    2.> Applications Manager

    For scheduling, there are three schedulers to choose from:

    1.> FIFO 2.> Capacity 3.> Fair
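
    To make the difference between the policies concrete, here is a simplified sketch of FIFO versus fair sharing over a single pool of identical containers. The function names and the demand numbers are made up for the example; the real schedulers work with queues, memory and vcores, and the Capacity scheduler (per-queue guaranteed shares) is not modeled here.

```python
def fifo_allocate(demands, capacity):
    """Serve applications strictly in arrival order.

    demands: list of (app, containers_wanted) in arrival order.
    """
    alloc = {}
    for app, want in demands:
        give = min(want, capacity)
        alloc[app] = give
        capacity -= give
    return alloc


def fair_allocate(demands, capacity):
    """Max-min fair share: keep splitting what is left evenly
    among the applications that still want more."""
    alloc = {app: 0 for app, _ in demands}
    remaining = dict(demands)
    while capacity > 0 and remaining:
        share = capacity // len(remaining)
        if share == 0:
            break  # fewer containers left than waiting applications
        for app in list(remaining):
            give = min(share, remaining[app])
            alloc[app] += give
            remaining[app] -= give
            capacity -= give
            if remaining[app] == 0:
                del remaining[app]
    return alloc


demands = [("a", 8), ("b", 4), ("c", 2)]
print(fifo_allocate(demands, 10))  # {'a': 8, 'b': 2, 'c': 0}
print(fair_allocate(demands, 10))  # {'a': 4, 'b': 4, 'c': 2}
```

    With FIFO the first large application starves the later ones; with fair sharing the small application is fully served and the rest is split evenly.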

    There is also a component called the Application Master, which the Resource Manager launches in a container on one of the Node Managers.

    One Application Master is assigned to one application.

    The client submits the job directly to the Resource Manager, which starts an Application Master for it. The Node Manager monitors the container in which the Application Master runs, and the Resource Manager restarts the Application Master if it fails.

    Whenever a job comes in, the Resource Manager creates an application ID and launches an Application Master for that job. In a typical MapReduce job it is the client and the Application Master, not the Resource Manager, that contact the NameNode: the client looks up the block locations of the input data to compute the input splits, and the Application Master uses that information to ask the Resource Manager for containers on or near the nodes that hold the data.

    This is a basic overview of how YARN works with the NameNode. You can also read about YARN in more detail.

    Also, the interaction with the NameNode comes only from the Hadoop applications running within YARN that talk to it. Not all YARN applications need to communicate with HDFS.
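
    The end-to-end flow can be sketched as a toy simulation, assuming a MapReduce-style application that has already obtained block locations from the NameNode and passes them along as locality preferences. `ToyResourceManager`, `run_job`, the host names, and the application ID format are all hypothetical and only mimic the shape of the interaction; this is not the YARN client API.

```python
import itertools


class ToyResourceManager:
    """Hypothetical stand-in for the Resource Manager (not a Hadoop API)."""

    def __init__(self, node_managers):
        # Node Manager host -> number of free container slots
        self.free = dict(node_managers)
        self._ids = itertools.count(1)

    def allocate(self, preferred):
        """Pick a data-local host if one has a free slot, else any free host."""
        candidates = [h for h in preferred if self.free.get(h, 0) > 0]
        candidates += [h for h, n in sorted(self.free.items()) if n > 0]
        if not candidates:
            raise RuntimeError("no free containers")
        host = candidates[0]
        self.free[host] -= 1
        return host

    def submit_application(self, name):
        """Assign an application id and start the Application Master."""
        app_id = f"application_{next(self._ids):04d}"  # toy id format
        am_host = self.allocate(preferred=[])  # first container hosts the AM
        return app_id, am_host


def run_job(block_locations, rm, name="wordcount"):
    """block_locations: list of (block id, [DataNode hosts]) that the
    application learned from the NameNode; the RM never asks for them."""
    app_id, am_host = rm.submit_application(name)
    # The Application Master requests one task container per block,
    # preferring the hosts that store that block.
    placement = {
        block: rm.allocate(preferred=hosts) for block, hosts in block_locations
    }
    return app_id, am_host, placement


rm = ToyResourceManager({"dn1": 2, "dn2": 1, "dn3": 1})
blocks = [("blk_1", ["dn2", "dn3"]), ("blk_2", ["dn2"])]
print(run_job(blocks, rm))
```

    In this sketch the second block cannot run on dn2 because its only slot is already taken, so it falls back to another node and would read the block over the network. An application that never touches HDFS would simply pass no preferred hosts, and no NameNode call would be involved at all.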