Introduction and Problem
Big Data Hadoop is widely adopted by companies these days, and with clusters averaging 50+ nodes and 100+ TB of storage in most enterprises, Big Data consultants and admins face a huge number of errors and issues every single day. One of the most common and toughest of these is a disk failure, where there is no option left but to replace the disk on which the Hadoop data is stored with a new one.
A disk replacement in Big Data Hadoop typically begins with a worker/datanode going down or showing bad health because one of that node's disks has failed. Please note that this applies to all Hadoop distributions, e.g. Apache, Cloudera, Hortonworks, MapR, IBM, AWS etc.
The solution here is to perform pre-disk-replacement tasks on the Hadoop worker node to ensure data is not corrupted or lost during or after the disk replacement.
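Before pulling a disk, the usual checks are that HDFS block replication is healthy (so anything on the faulty volume can be re-replicated) and that the failing data directory has been identified on the node. A minimal sketch of such checks follows; `hdfs dfsadmin` and `hdfs fsck` are standard HDFS commands, while the `DATA_DIRS` paths and the `check_dir` helper are hypothetical illustrations that must be adapted to your cluster's `dfs.datanode.data.dir` setting:

```shell
# Cluster-wide view: confirm replication is healthy before removing a disk.
# (Requires the HDFS CLI and a running cluster; skipped if the CLI is absent.)
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfsadmin -report | grep -E 'Live datanodes|Under replicated'
  hdfs fsck / | grep -E 'Under-replicated blocks|Missing|CORRUPT'
fi

# Local view: flag any data directory that is missing or read-only --
# common symptoms of a failed or failing disk on the datanode itself.
check_dir() {
  if [ ! -d "$1" ]; then
    echo "MISSING: $1"
  elif [ ! -w "$1" ]; then
    echo "READ-ONLY: $1"
  else
    echo "OK: $1"
  fi
}

DATA_DIRS="/data/1/dfs /data/2/dfs"   # hypothetical mount points
for d in $DATA_DIRS; do
  check_dir "$d"
done
```

Any directory reported as MISSING or READ-ONLY points at the volume that will need replacing, and the fsck output tells you whether HDFS still has enough healthy replicas to tolerate taking that disk offline.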
Hence a typical disk replacement activity will involve the following major tasks (for a given Big Data Hadoop worker/data node):