YARN is an open source Apache project whose name stands for “Yet Another Resource Negotiator”. It is the Hadoop cluster manager, responsible for allocating resources (such as CPU, memory, disk and network) and for scheduling and monitoring jobs across the Hadoop cluster. Earlier versions of Hadoop could only run MapReduce jobs on a Hadoop cluster; the advent of YARN made it possible to run other big data frameworks such as Spark, Flink, Samza and many more on the same cluster. YARN supports different types of workloads, such as stream processing, batch processing, graph processing and iterative processing.
Apache YARN consists of two main components: the Resource Manager and the Node Manager. There is one Resource Manager per cluster, whereas a Node Manager daemon runs on every worker node.
The Resource Manager is a daemon that is responsible for allocating resources in the cluster. It has two main components, namely the Scheduler and the Applications Manager. The Scheduler is responsible for scheduling applications across the cluster based on their memory and CPU requirements. Only one Resource Manager is active per cluster at any time.
The Applications Manager accepts job submissions, starts the application-specific Application Master for each job and restarts it in case of failure.
The Node Manager is a daemon that runs on every worker node and manages resources at the machine level. It defines the resources that are available on the node and keeps track of their usage. It also tracks the health of the node and, if the node is found to be unhealthy, reports this to the Resource Manager. The Node Manager sends regular reports about resource usage to the Resource Manager and coordinates with the Application Master to spawn JVMs for task execution.
The Application Master is responsible for handling the entire life cycle of an application, starting with resource negotiation and continuing through tracking and monitoring of the job status.
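The division of labour between these components can be sketched as a toy model in Python. This is not the YARN API: the names `Node`, `ToyResourceManager` and `allocate` are hypothetical, the first-fit placement is an assumption, and heartbeats and container launch are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy stand-in for a Node Manager: tracks what one worker has free."""
    name: str
    free_mem_mb: int
    free_vcores: int
    healthy: bool = True


@dataclass
class ToyResourceManager:
    """Toy stand-in for the Resource Manager's scheduler (hypothetical API)."""
    nodes: list = field(default_factory=list)

    def allocate(self, mem_mb, vcores):
        """Return the name of the first healthy node that fits, or None."""
        for node in self.nodes:
            fits = node.free_mem_mb >= mem_mb and node.free_vcores >= vcores
            if node.healthy and fits:
                node.free_mem_mb -= mem_mb
                node.free_vcores -= vcores
                return node.name
        return None


rm = ToyResourceManager([
    Node("worker-1", free_mem_mb=4096, free_vcores=2),
    Node("worker-2", free_mem_mb=8192, free_vcores=4, healthy=False),
    Node("worker-3", free_mem_mb=8192, free_vcores=4),
])

# The Application Master gets its own container first,
# then negotiates containers for its tasks.
am_container = rm.allocate(mem_mb=1024, vcores=1)
task_containers = [rm.allocate(mem_mb=3072, vcores=2) for _ in range(3)]
print(am_container, task_containers)
```

In this sketch the unhealthy node is never used, and the third task request is refused because no healthy node has enough free vcores left, which is the point at which a real Application Master would wait for resources to be released.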
YARN Supported Frameworks
YARN is not limited to Hadoop MapReduce; it can also run many of the most promising big data frameworks, such as Spark, Flink, Samza and many more. The list below shows the frameworks that can currently run on top of YARN.
YARN supports three scheduling policies, namely FIFO, Capacity and Fair Scheduling, which decide how incoming jobs are scheduled and prioritized.
FIFO Scheduler: Under the FIFO Scheduler policy, applications are served on a “first in, first out” basis. This policy can lead to job starvation if the cluster is shared among multiple users, so it is not optimal for shared clusters. Note that current Apache Hadoop releases ship with the Capacity Scheduler as the default, so the FIFO policy has to be selected explicitly.
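The starvation problem is easy to see in a small simulation. The sketch below is a simplification that assumes every job occupies the whole cluster while it runs; the function name `fifo_completion_times` and the job names are made up for illustration.

```python
from collections import deque


def fifo_completion_times(jobs):
    """Run jobs one at a time in arrival order.

    `jobs` is a list of (name, runtime_minutes) tuples that all arrive at t=0.
    Returns {name: minute at which the job finishes}.
    """
    queue = deque(jobs)
    clock = 0
    finished = {}
    while queue:
        name, runtime = queue.popleft()
        clock += runtime          # the job holds the cluster until it is done
        finished[name] = clock
    return finished


times = fifo_completion_times([("nightly-etl", 120), ("adhoc-query", 2)])
print(times)  # {'nightly-etl': 120, 'adhoc-query': 122}
```

The two-minute query finishes after 122 minutes only because it was submitted behind a long-running job.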
Capacity Scheduler: With the Capacity Scheduler, different organizations share one Hadoop cluster to maximize its utilization. Although the organizations share the cluster, the Capacity Scheduler makes sure that each of them gets its required capacity. It provides capacity guarantees, elasticity, resource-based scheduling, priority scheduling, multi-tenancy and much more. To select the Capacity Scheduler explicitly, set the yarn.resourcemanager.scheduler.class property in the conf/yarn-site.xml file to org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.
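The combination of capacity guarantees and elasticity can be illustrated with a short calculation. This is only a sketch of the idea, not the Capacity Scheduler's actual algorithm: the function `capacity_allocation`, the queue names and the rule of lending spare capacity in proportion to the guarantees are assumptions, and preemption and maximum-capacity limits are ignored.

```python
def capacity_allocation(total, guaranteed_pct, demand):
    """Split `total` resource units among queues.

    guaranteed_pct: {queue: percentage of the cluster guaranteed to it}
    demand:         {queue: units the queue currently wants}
    Each queue first receives min(demand, guarantee). Capacity left over is
    then lent to queues that still want more, in proportion to their guarantees.
    """
    guarantee = {q: total * pct / 100 for q, pct in guaranteed_pct.items()}
    alloc = {q: min(demand.get(q, 0), guarantee[q]) for q in guarantee}
    spare = total - sum(alloc.values())
    while spare > 1e-9:
        hungry = [q for q in alloc if demand.get(q, 0) - alloc[q] > 1e-9]
        weight = sum(guarantee[q] for q in hungry)
        if weight == 0:
            break  # no queue with a guarantee wants more
        handed_out = 0.0
        for q in hungry:
            give = min(spare * guarantee[q] / weight, demand[q] - alloc[q])
            alloc[q] += give
            handed_out += give
        spare -= handed_out
    return alloc


# org_a is guaranteed 60% but only needs 20 units, so org_b may borrow the rest.
print(capacity_allocation(100, {"org_a": 60, "org_b": 40},
                          {"org_a": 20, "org_b": 90}))
```

When both organizations ask for more than their guarantee, the same function returns exactly the guaranteed split (60 and 40 units), which is the capacity guarantee described above.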
Fair Scheduler: The Fair Scheduling policy makes sure that all running jobs get a roughly equal share of resources (memory or CPU). Jobs are divided into queues, and resources are shared equally among those queues. Each queue is always guaranteed its minimum share, and if a queue is empty, its excess resources are distributed to jobs running in other queues. We can also define a set of rules that are applied to submitted applications so that each application lands in the appropriate queue for further processing.
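The idea of equal shares with redistribution of unused capacity corresponds to max-min fairness, which can be sketched in a few lines. The sketch assumes equally weighted queues and a single resource, and it leaves out minimum shares and placement rules; `fair_shares` and the queue names are hypothetical.

```python
def fair_shares(total, demand):
    """Max-min fair split of `total` among queues with the given demands."""
    shares = {q: 0.0 for q in demand}
    active = {q for q, d in demand.items() if d > 0}
    remaining = float(total)
    while active and remaining > 1e-9:
        equal = remaining / len(active)
        # Queues that need no more than an equal slice are fully satisfied.
        satisfied = {q for q in active if demand[q] - shares[q] <= equal}
        if not satisfied:
            # Everyone wants more than an equal slice: split evenly and stop.
            for q in active:
                shares[q] += equal
            remaining = 0.0
            break
        for q in satisfied:
            remaining -= demand[q] - shares[q]
            shares[q] = float(demand[q])
        active -= satisfied
    return shares


print(fair_shares(90, {"etl": 80, "adhoc": 10, "ml": 50}))
# {'etl': 40.0, 'adhoc': 10.0, 'ml': 40.0}
```

The small queue receives everything it asks for, and the capacity it does not need is split evenly between the two larger queues.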
Users can reserve resources in YARN through the reservation system so that critical applications always get their resources on time. Any leaf queue can be marked as a reservation queue in the Fair and Capacity Schedulers (fair-scheduler.xml/capacity-scheduler.xml). Let’s see how it works:
The user submits a reservation creation request and receives a reservation ID. In the next step, the user sends the reservation request along with the reservation ID, and a ReservationAgent called GREE creates a reservation in the Plan (the Plan is a data structure that maintains and tracks all reservations). From then on, whenever the user submits an application with the reservation ID, the scheduler makes sure that the application gets the reserved resources. However, reserved resources that are not in use can be used to run other applications as well.
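The two-step flow and the role of the Plan can be modelled roughly as follows. This is a deliberately simplified sketch and not YARN's implementation: the `Plan` class here, its methods and the fixed time slots are assumptions, and the real agent has more freedom in where it places a reservation than this admit-or-reject check.

```python
import itertools


class Plan:
    """Toy Plan: tracks reserved capacity per time slot (hypothetical class)."""

    def __init__(self, capacity, horizon):
        self.capacity = capacity
        self.load = [0] * horizon        # reserved units in each time slot
        self.reservations = {}           # reservation id -> (start, end, units)
        self._ids = itertools.count(1)

    def new_reservation_id(self):
        """Step 1: hand out an id before any resources are committed."""
        return f"reservation_{next(self._ids)}"

    def submit(self, reservation_id, start, end, units):
        """Step 2: admit the reservation only if every slot in [start, end) has room."""
        if not (0 <= start < end <= len(self.load)):
            return False
        if any(self.load[t] + units > self.capacity for t in range(start, end)):
            return False
        for t in range(start, end):
            self.load[t] += units
        self.reservations[reservation_id] = (start, end, units)
        return True

    def unreserved(self, t):
        """Capacity in slot t that no reservation has claimed."""
        return self.capacity - self.load[t]


plan = Plan(capacity=10, horizon=24)
first = plan.new_reservation_id()
print(plan.submit(first, start=2, end=5, units=6))    # True
second = plan.new_reservation_id()
print(plan.submit(second, start=4, end=6, units=5))   # False: slot 4 would exceed capacity
```

Rejecting the second request is what lets the scheduler honour the first one later, since the Plan never promises more than the cluster can deliver in any slot.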
Before Hadoop 2.4, the Resource Manager was the single point of failure in a YARN cluster. Since Hadoop 2.4, the Resource Manager can run in Active/Standby mode to provide fault tolerance and high availability. A standby Resource Manager can take the place of the active one in case of failure, recovering from the state that the active Resource Manager has persisted. The Resource Manager works closely with ZooKeeper, both to write its state and to decide which Resource Manager should become active after a failure. The failover transition from standby to active can happen either manually or automatically. A manual failover is performed by an administrator using the “yarn rmadmin” CLI, whereas automatic failover relies on ZooKeeper.
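The Active/Standby mechanism can be pictured with a toy model in which a shared store plays the role of ZooKeeper. Everything here is hypothetical (`ToyStateStore`, `ToyHaResourceManager` and their methods), and the single `active_holder` field is only a stand-in for a real leader election.

```python
class ToyStateStore:
    """Stands in for ZooKeeper: holds application state and the active-RM lock."""

    def __init__(self):
        self.active_holder = None
        self.app_state = {}


class ToyHaResourceManager:
    def __init__(self, rm_id, store):
        self.rm_id = rm_id
        self.store = store
        self.alive = True

    @property
    def is_active(self):
        return self.alive and self.store.active_holder == self.rm_id

    def try_become_active(self):
        """Take the lock if it is free; only one RM can hold it at a time."""
        if self.alive and self.store.active_holder is None:
            self.store.active_holder = self.rm_id
        return self.is_active

    def submit(self, app_id):
        """Only the active RM accepts applications and persists their state."""
        if not self.is_active:
            raise RuntimeError(f"{self.rm_id} is in standby")
        self.store.app_state[app_id] = "ACCEPTED"

    def crash(self):
        """Simulate an RM failure; its lock is released so a standby can take over."""
        self.alive = False
        if self.store.active_holder == self.rm_id:
            self.store.active_holder = None


store = ToyStateStore()
rm1 = ToyHaResourceManager("rm1", store)
rm2 = ToyHaResourceManager("rm2", store)

rm1.try_become_active()          # rm1 wins the election
rm2.try_become_active()          # rm2 stays in standby
rm1.submit("app_1")
rm1.crash()
rm2.try_become_active()          # automatic failover: rm2 takes the lock
print(rm2.is_active, store.app_state)
```

Because the application state lives in the shared store rather than inside `rm1`, the new active instance can see `app_1` after the failover.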
YARN Federation is a technique for joining several smaller YARN clusters together so that they appear as one large cluster. An application running on a federated cluster can be scheduled on any node of any sub-cluster. There are multiple Resource Managers, one per sub-cluster. This architecture provides much more flexibility and scalability, because each Resource Manager handles only part of the overall cluster, which improves the performance of scheduling and monitoring.
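How a federated cluster might place an application on one of its sub-clusters can be sketched as a simple routing function. The policy used here (pick the sub-cluster with the most free memory) is an assumption chosen for illustration, as are the function name `route_application` and the sub-cluster names.

```python
def route_application(subclusters, mem_mb):
    """Pick the sub-cluster with the most free memory that can fit the request.

    subclusters: {name: free memory in MB}, updated in place on success.
    Returns the chosen sub-cluster name, or None if nothing fits.
    """
    candidates = {name: free for name, free in subclusters.items() if free >= mem_mb}
    if not candidates:
        return None
    chosen = max(candidates, key=candidates.get)
    subclusters[chosen] -= mem_mb
    return chosen


federation = {"sc-east": 4096, "sc-west": 16384}
placements = [route_application(federation, 8192) for _ in range(3)]
print(placements)  # ['sc-west', 'sc-west', None]
```

Each sub-cluster keeps its own Resource Manager, so the routing step only decides where a request goes; the actual scheduling still happens inside the chosen sub-cluster.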
YARN Versus Mesos
Apache Mesos is another well-known resource manager. There are a few significant differences between the two.
| YARN | Mesos |
|------|-------|
| Written in Java | Written in C++ |
| By default, scheduling is based on memory only | By default, scheduling is based on both memory and CPU |
| Monolithic scheduler: a job is scheduled and deployed in a single step | Non-monolithic scheduler: a job is scheduled and deployed in a two-step process |
| Less scalable | More scalable |
Undoubtedly, YARN is a robust, flexible, configurable and extensible resource management engine that supports more than 15 big data frameworks. It allows external systems to leverage the Hadoop Distributed File System (HDFS). It is in high demand and widely used across the industry.