About Me

I love Java-related technologies. Recently I have been researching Enterprise Integration (SOA and Messaging), Mobility and Big Data. I have worked with Java-related technologies as a Software Architect, Enterprise Architect and Software Developer/Engineer for over 11 years. Currently, I am working as a Senior Consultant at VMware Inc.

Tuesday, November 13, 2012

Hadoop clustering components

A Hadoop cluster is composed of the following daemons, running on one server or across multiple servers:
  • NameNode -- keeps track of the file metadata: which files are in the system and how each file is broken down into blocks
  • DataNode -- stores the data blocks and constantly reports to the NameNode so that the metadata stays up to date
  • Secondary NameNode -- assistant daemon that monitors the state of the cluster's HDFS. It communicates with the NameNode to take snapshots of the HDFS metadata at intervals defined by the cluster configuration
  • JobTracker -- liaison between the application and Hadoop. It determines the execution plan by deciding which files to process, assigning nodes to different tasks, and monitoring all tasks as they run.
    • one per Hadoop cluster
    • automatically relaunches failed tasks
    • oversees the overall execution of a MapReduce job
  • TaskTracker -- slave to the JobTracker
    • executes individual tasks that the JobTracker assigns
    • one per slave node
    • able to spawn multiple map or reduce tasks in parallel
    • sends heartbeats to the JobTracker
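
To make the JobTracker/TaskTracker relationship a bit more concrete, here is a minimal sketch in plain Java of the heartbeat and relaunch idea described above. It is a toy model, not Hadoop's real API: the class MiniJobTracker, its methods, the tracker names (tt1, tt2), the task names and the timeout value are all hypothetical and chosen only for illustration. Time is passed in as a simple number so the example stays deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: hypothetical classes, NOT Hadoop's actual implementation or API.
class MiniJobTracker {
    // Task id -> name of the TaskTracker currently assigned to run it.
    private final Map<String, String> assignments = new HashMap<>();
    // TaskTracker name -> logical time of its last heartbeat (sorted by name).
    private final Map<String, Long> lastHeartbeat = new TreeMap<>();
    // A tracker silent for longer than this is treated as failed (assumed value).
    private final long timeout;

    MiniJobTracker(long timeout) {
        this.timeout = timeout;
    }

    // Called by a TaskTracker to report that it is alive at logical time `now`.
    void heartbeat(String tracker, long now) {
        lastHeartbeat.put(tracker, now);
    }

    // Record that `task` is running on `tracker`.
    void assign(String task, String tracker) {
        assignments.put(task, tracker);
    }

    String trackerFor(String task) {
        return assignments.get(task);
    }

    private boolean isAlive(String tracker, long now) {
        Long seen = lastHeartbeat.get(tracker);
        return seen != null && now - seen <= timeout;
    }

    // Move every task whose tracker missed its heartbeat to the first live
    // tracker (by name) and return the ids of the relaunched tasks.
    List<String> relaunchFailed(long now) {
        List<String> relaunched = new ArrayList<>();
        String live = null;
        for (String tracker : lastHeartbeat.keySet()) {
            if (isAlive(tracker, now)) {
                live = tracker;
                break;
            }
        }
        if (live == null) {
            return relaunched; // no healthy tracker to take over
        }
        for (Map.Entry<String, String> e : assignments.entrySet()) {
            if (!isAlive(e.getValue(), now)) {
                e.setValue(live);
                relaunched.add(e.getKey());
            }
        }
        return relaunched;
    }

    public static void main(String[] args) {
        MiniJobTracker jt = new MiniJobTracker(10);
        jt.heartbeat("tt1", 0);
        jt.heartbeat("tt2", 0);
        jt.assign("map-0", "tt1");
        jt.assign("map-1", "tt2");
        jt.heartbeat("tt2", 15); // tt1 stays silent from here on
        System.out.println(jt.relaunchFailed(20));   // tasks moved off tt1
        System.out.println(jt.trackerFor("map-0"));  // tracker now running map-0
    }
}
```

In this sketch the task that was on the silent tracker is handed to the one that is still sending heartbeats. A real cluster would also take things like data locality and free task slots into account when choosing where to relaunch; that is left out here to keep the example short.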
