Monday, 25 September 2017

Introduction To Hadoop

What is Hadoop?

Hadoop is open-source software for reliable, scalable, distributed computing.

Hadoop is an open-source, Java-based programming framework that supports the processing and storage of very large data sets in a distributed computing environment.

It is an Apache project sponsored by the Apache Software Foundation.
Hadoop is designed to store and analyze large sets of unstructured data.
The core of Apache Hadoop consists of a storage part, known as the Hadoop Distributed File System (HDFS), and a processing part, which is the MapReduce programming model.

Hadoop splits files into large blocks and distributes them across the nodes in a cluster. It then transfers packaged code to the nodes to process the data in parallel.
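To make the idea concrete, here is a minimal sketch in plain Python. It is not Hadoop code: the function names, the tiny block size, the node names and the round-robin placement are all assumptions made for illustration. It splits some data into blocks, assigns the blocks to nodes, and runs the same small task on each node's blocks at the same time.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(data, block_size):
    """Cut a byte string into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def distribute_blocks(blocks, nodes):
    """Assign block indexes to nodes round-robin (a simplification, not real HDFS placement)."""
    placement = {node: [] for node in nodes}
    for index in range(len(blocks)):
        placement[nodes[index % len(nodes)]].append(index)
    return placement


def run_on_nodes(task, blocks, placement):
    """Run the same task over each node's blocks, one worker thread standing in for each node."""
    def node_job(block_ids):
        return [task(blocks[i]) for i in block_ids]

    with ThreadPoolExecutor(max_workers=max(1, len(placement))) as pool:
        futures = {node: pool.submit(node_job, ids) for node, ids in placement.items()}
        return {node: future.result() for node, future in futures.items()}


if __name__ == "__main__":
    blocks = split_into_blocks(b"abcdefghij", 4)           # hypothetical 4-byte blocks
    placement = distribute_blocks(blocks, ["node-1", "node-2"])
    print(run_on_nodes(len, blocks, placement))            # each node measures its own blocks
```

Real block sizes are far larger than the 4 bytes used here; the tiny value only keeps the example readable.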

This approach takes advantage of data locality, where nodes manipulate the data they have access to.

This allows the dataset to be processed faster and more efficiently than it would be in a more conventional supercomputer architecture that relies on a parallel file system where computation and data are distributed via high-speed networking.
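A rough back-of-the-envelope comparison shows why locality helps. The sketch below is only an illustration: the two function names and every size in it are made-up assumptions, not measurements. It counts how much has to cross the network when the data is moved to the computation versus when the packaged code is moved to the data.

```python
def moved_when_data_goes_to_compute(block_sizes_mb):
    """Conventional model: every block travels over the network to the compute node."""
    return sum(block_sizes_mb)


def moved_when_code_goes_to_data(code_size_mb, result_size_mb, node_count):
    """Data-local model: each node receives the packaged code and sends back a small result."""
    return node_count * (code_size_mb + result_size_mb)


if __name__ == "__main__":
    block_sizes_mb = [128, 128, 64]   # hypothetical blocks, one per node
    print(moved_when_data_goes_to_compute(block_sizes_mb))                        # 320
    print(moved_when_code_goes_to_data(code_size_mb=1, result_size_mb=1,
                                       node_count=len(block_sizes_mb)))           # 6
```

Under these assumed numbers the data-local approach moves 6 MB instead of 320 MB, and the gap grows as the dataset gets bigger while the code stays the same size.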

The 4 Modules of Hadoop
  1. Hadoop Distributed File-System (HDFS)
  2. MapReduce
  3. Hadoop Common
  4. YARN

Hadoop Distributed File-System (HDFS) :

A "file system" is the method used by a computer to store data so that it can be found and used. Usually this is determined by the computer's operating system; however, a Hadoop system uses its own file system, which sits "above" the file system of the host computer, meaning it can be accessed using any computer running any supported OS.

MapReduce :

An implementation of the MapReduce programming model for large-scale data processing.

MapReduce is named after the two basic operations it performs: reading data from the database and putting it into a format suitable for analysis (map), and performing mathematical operations, e.g. counting the number of males aged 29+ in a customer database (reduce).
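The customer example can be sketched in a few lines of plain Python. This is a simulation of the programming model, not the Hadoop API: the sample records, the field names, the key "male_29_plus" and the reading of "29+" as "29 or older" are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical customer records standing in for the customer database.
CUSTOMERS = [
    {"name": "Arun", "gender": "M", "age": 34},
    {"name": "Beth", "gender": "F", "age": 41},
    {"name": "Carl", "gender": "M", "age": 22},
    {"name": "Dev", "gender": "M", "age": 29},
]


def map_phase(record):
    """Map: emit a (key, 1) pair for every record that matches the question being asked."""
    if record["gender"] == "M" and record["age"] >= 29:
        yield ("male_29_plus", 1)


def shuffle(pairs):
    """Group the emitted values by key, as happens between the map and reduce steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all the values for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on one machine."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(run_job(CUSTOMERS))  # {'male_29_plus': 2}
```

On a real cluster the map step would run on many nodes at once, each over its own blocks, and only the small (key, value) pairs would be sent on to the reduce step.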

Hadoop Common :

Contains libraries and utilities needed by other Hadoop modules.

Hadoop Common provides the tools (in Java) needed for the user's computer systems (Windows, Unix or whatever) to read data stored under the Hadoop file system.


YARN :

A platform responsible for managing computing resources in clusters and using them for scheduling users' applications.

It manages the resources of the systems that store the data and run the analysis.
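The resource-management idea behind YARN can be illustrated with a toy model. The class below is not YARN's API or its actual scheduling policy: the class name, the method names, the node names, the memory figures and the simple first-fit rule are all assumptions made for this sketch. It tracks how much memory each node has free and hands out pieces of it to applications on request.

```python
class ToyResourceManager:
    """Tracks free memory per node and grants requests first-fit (illustrative only)."""

    def __init__(self, node_memory_mb):
        self.free = dict(node_memory_mb)   # node name -> free memory in MB
        self.allocations = []              # (application, node, memory in MB)

    def request(self, app, memory_mb):
        """Give the application memory on the first node with enough free; None if no node fits."""
        for node, free_mb in self.free.items():
            if free_mb >= memory_mb:
                self.free[node] -= memory_mb
                self.allocations.append((app, node, memory_mb))
                return node
        return None

    def release(self, app):
        """Return everything held by a finished application to its nodes."""
        remaining = []
        for owner, node, memory_mb in self.allocations:
            if owner == app:
                self.free[node] += memory_mb
            else:
                remaining.append((owner, node, memory_mb))
        self.allocations = remaining


if __name__ == "__main__":
    manager = ToyResourceManager({"node-1": 4096, "node-2": 2048})
    print(manager.request("app-a", 3072))  # node-1
    print(manager.request("app-b", 2048))  # node-2
    print(manager.request("app-c", 2048))  # None, the cluster is full
    manager.release("app-a")
    print(manager.request("app-c", 2048))  # node-1
```

The point of the sketch is the division of labour: applications ask for resources, and one central component decides where they run based on what each node has left.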


