IT Gossips: Hadoop clustering components

Tuesday, November 13, 2012

A Hadoop cluster is composed of the following daemons, which can run on one server or across multiple servers:

NameNode -- keeps track of the file metadata: which files are in the system and how each file is broken down into blocks
DataNode -- stores the actual data blocks (including replicas) and regularly reports to the NameNode so that the block metadata stays up to date
Secondary NameNode -- assistant daemon that monitors the state of the cluster's HDFS. It communicates with the NameNode to take snapshots of the HDFS metadata at intervals defined by the cluster configuration
JobTracker -- liaison between the application and Hadoop. It determines the execution plan by deciding which files to process, assigns nodes to different tasks, and monitors all tasks while they are running.
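
To make the NameNode/DataNode split more concrete, here is a toy Python sketch. It is not the Hadoop API: the class names (ToyNameNode, ToyDataNode), the method names, and the tiny 64-byte block size are all made up for illustration. It only shows the idea that the NameNode holds metadata (file to blocks, block to DataNodes) while DataNodes hold the blocks and report what they have.

```python
from collections import defaultdict

BLOCK_SIZE = 64  # bytes; deliberately tiny, an assumption for this toy example


class ToyNameNode:
    """Hypothetical stand-in: holds metadata only, never the data itself."""

    def __init__(self):
        self.file_blocks = {}                     # filename -> [block_id, ...]
        self.block_locations = defaultdict(set)   # block_id -> {datanode name}
        self._next_block_id = 0

    def add_file(self, name, size):
        """Record a file of `size` bytes as a list of fixed-size block IDs."""
        count = max(1, -(-size // BLOCK_SIZE))    # ceiling division
        ids = list(range(self._next_block_id, self._next_block_id + count))
        self._next_block_id += count
        self.file_blocks[name] = ids
        return ids

    def block_report(self, datanode, block_ids):
        """Replace what we know about one DataNode with its latest report."""
        for holders in self.block_locations.values():
            holders.discard(datanode)             # forget stale entries
        for block_id in block_ids:
            self.block_locations[block_id].add(datanode)

    def locate(self, name):
        """Return [(block_id, [datanodes holding it]), ...] for a file."""
        return [(b, sorted(self.block_locations[b]))
                for b in self.file_blocks[name]]


class ToyDataNode:
    """Hypothetical stand-in: stores block IDs and reports them upstream."""

    def __init__(self, name):
        self.name = name
        self.blocks = set()

    def store(self, block_id):
        self.blocks.add(block_id)

    def report_to(self, namenode):
        namenode.block_report(self.name, sorted(self.blocks))


if __name__ == "__main__":
    nn = ToyNameNode()
    nn.add_file("logs.txt", 150)                  # 150 bytes -> 3 blocks
    dn1, dn2 = ToyDataNode("dn1"), ToyDataNode("dn2")
    for block_id in (0, 1):
        dn1.store(block_id)
    for block_id in (1, 2):
        dn2.store(block_id)
    dn1.report_to(nn)
    dn2.report_to(nn)
    print(nn.locate("logs.txt"))
```

In the real system the reports arrive over the network on a schedule and blocks are far larger, but the division of labour is the same: ask the NameNode where the blocks are, then read them from the DataNodes.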
