
Ansible (DevOps): Creating Roles to Set Up Hadoop


DevOps is a set of software development practices that combine Software Development (Dev) and Information Technology Operations (Ops). In this blog we will use Ansible, an IT configuration management and deployment tool, to automate the setup of a Hadoop cluster.


What is Hadoop?

Hadoop is a collection of open-source software utilities that use many computers connected over a network to solve problems involving large volumes of data and computation. It is maintained by the Apache Software Foundation.

What issue do we solve by automating Hadoop?

With the advancement of technology, time has become a major constraint. Many tasks are still done manually, and almost every company today faces the problem of storing and processing large amounts of data. When a new system is added to the infrastructure, all the existing configuration has to be deployed onto it to match the rest of the environment, which takes a lot of time when done by hand. These tasks can now be handled through automation.

Why do we need to automate Hadoop?

Deploying an infrastructure-grade Hadoop cluster is a monumental task and can take a lot of time, as every system needs to be configured for its specific purpose: data nodes for storage, other nodes for job scheduling and processing, and so on. In this setup we implement HDFS, which is mainly used for data storage.

Hadoop Architecture


Ansible Architectural Diagram:


The software and hardware requirements of this project are as follows:

Sr. No. | SOFTWARE              | HARDWARE
1.      | RHEL 7.5 and above    | A minimum of 1 GHz processor
2.      | YML, JINJA            | A minimum of 1 GB RAM
3.      | Ansible, HTTP, Hadoop | No strict hard disk requirements

METHODOLOGY

Hadoop:

The steps to automate the primary infrastructure software services are as follows:

Step 1: Install the Ansible package on Linux using yum, with the command “yum install ansible”. Before this, yum must be configured.

Step 2: Create an Ansible Galaxy role skeleton for the Hadoop cluster in a separate folder (e.g. playbooks) using the command “ansible-galaxy init hadoop-cluster”. This Hadoop cluster is implemented to solve the big data problem.

Step 3: Put the client IPs in the hosts file so that Ansible can read them from there and run the playbook on those systems automatically. The default location of the hosts file is “/etc/ansible/hosts”.

Step 4: Configure Ansible as needed in the ansible.cfg file.
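As a minimal sketch (the original post does not show its ansible.cfg, so the settings below are assumptions), the file might point at the inventory used later in this post:

```ini
; Hypothetical minimal ansible.cfg -- the settings are assumptions,
; not taken from the original post.
[defaults]
inventory = /etc/ansible/hosts   ; the hosts file from Step 3
host_key_checking = False        ; convenient for a lab setup; avoid in production
```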

Step 5: Write Hadoop cluster roles to set up the master node and the slave nodes.

Step 6: Create a site.yml file that imports the Hadoop roles.

Step 7: Execute the file using the command “ansible-playbook site.yml”.

Step 8: In the client node role, we copy the Java and Hadoop setup files to the respective nodes.

Step 9: In the master node role, we copy the master node configuration, i.e. core-site.xml and hdfs-site.xml, onto the master node machine.

Step 10: In the slave node role, we copy the slave node configuration, i.e. core-site.xml and hdfs-site.xml, onto the slave node machines.

Step 11: Now run the following commands:

On the name node - “hadoop-daemon.sh start namenode”

On the data nodes - “hadoop-daemon.sh start datanode”
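The two daemon-start commands above could also be driven from Ansible itself. A sketch of such a play (these tasks are not part of the original roles, so this is an assumption):

```yaml
# Assumed extra plays, not in the original roles: start the daemons
# on the appropriate inventory groups instead of logging in manually.
- hosts: nn
  tasks:
    - command: "hadoop-daemon.sh start namenode"

- hosts: dn
  tasks:
    - command: "hadoop-daemon.sh start datanode"
```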

Step 12: On the client machine, check the Hadoop setup by running the following command:

“hadoop dfsadmin -report”

Step 13: To upload a file, use the command: “hadoop fs -put filename /”

Step 14: To read a file, use the command: “hadoop fs -cat /filename”

All the respective yml files are listed below:

NOTE: STRICT INDENTATION MUST BE USED

Site.yml

- name: deploy slave node
  import_playbook: sn.yml

- name: deploy client node
  import_playbook: cn.yml
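The listing above only imports the slave and client playbooks; presumably the master node playbook is imported the same way. A sketch under that assumption (the filename mn.yml is a guess, not from the original post):

```yaml
# Assumed: the master-node play, imported like sn.yml and cn.yml above.
- name: deploy master node
  import_playbook: mn.yml   # filename is an assumption
```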

Sn.yml

- hosts: dn
  roles:
    - role: slavenode
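cn.yml is not listed in the post, but it would presumably mirror sn.yml. A sketch under that assumption (the "cn" group and "clientnode" role names are guesses, not from the original):

```yaml
# Assumed cn.yml, mirroring sn.yml; group and role names are guesses.
- hosts: cn
  roles:
    - role: clientnode
```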

Inventory

[dn]  # data nodes
192.168.56.115 ansible_user=root ansible_password=redhat  # slave1
192.168.56.116 ansible_user=root ansible_password=redhat  # slave2

[nn]  # master node
192.168.56.114 ansible_user=root ansible_password=redhat  # master

Client role: main.yml

- command: "rpm -ivh hadoop-1.2.1-1.x86_64.rpm --force"

- command: "rpm -ivh jdk-8u171-linux-x64.rpm --force"

- template:
    src: ".bashrc"
    dest: "/root/.bashrc"

- template:
    src: "core-site.xml.j2"
    dest: "/etc/hadoop/core-site.xml"

Master Node: main.yml

- command: "rpm -ivh hadoop-1.2.1-1.x86_64.rpm --force"

- command: "rpm -ivh jdk-8u171-linux-x64.rpm --force"

- template:
    src: ".bashrc"
    dest: "/root/.bashrc"

- file:
    path: /master
    state: directory

- template:
    src: "hdfs-site.xml"
    dest: "/etc/hadoop/hdfs-site.xml"

- template:
    src: "core-site.xml.j2"
    dest: "/etc/hadoop/core-site.xml"

- command: "hadoop namenode -format -force"

# - command: "hadoop-daemon.sh start namenode"
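The hdfs-site.xml template itself is not listed in the post. Since the master node role creates a /master directory, a plausible Hadoop 1.x template (an assumption, not the original file) would point the NameNode storage there:

```xml
<!-- Assumed hdfs-site.xml for Hadoop 1.x; the /master path matches the
     directory created by the file task above. Not the original file. -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/master</value>
  </property>
</configuration>
```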

Core-Site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    {% for i in groups["nn"] %}
    <value>hdfs://{{ i }}:9001</value>
    {% endfor %}
  </property>
</configuration>
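With the single master in the [nn] inventory group (192.168.56.114), the Jinja loop in this template renders the file as:

```xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.56.114:9001</value>
  </property>
</configuration>
```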

SCREENSHOTS:

(Screenshots omitted: Hadoop web UI, cluster summary, and file uploads.)

