A Hadoop cluster is composed of the following daemons, running on one server or across multiple servers:
- NameNode -- keeps track of the file metadata: which files are in the system and how each file is broken down into blocks
- DataNode -- provides the backing store for the data blocks and constantly reports to the NameNode so that the metadata stays up to date
- Secondary NameNode -- assistant daemon for monitoring the state of the cluster's HDFS. It communicates with the NameNode to take snapshots of the HDFS metadata at intervals defined by the cluster configuration
- JobTracker -- liaison between the application and Hadoop. It determines the execution plan by deciding which files to process, assigns nodes to different tasks, and monitors all tasks as they run.
  - one per Hadoop cluster
  - automatically relaunches a failed task
  - oversees the overall execution of a MapReduce job
- TaskTracker -- slave to the JobTracker
  - executes the individual tasks that the JobTracker assigns
  - one per slave node
  - able to spawn multiple map or reduce tasks in parallel
  - sends heartbeats to the JobTracker
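
The JobTracker/TaskTracker relationship described above can be sketched as a toy model. This is only an illustration, not Hadoop code: the class names mirror the daemons, but the methods (`heartbeat`, `submit`, `process_heartbeats`), the `slots` parameter, and the node and task names are all assumptions made up for this example. A node that is down is modeled as a missed heartbeat (`None`), and its tasks are put back in the queue so they can be relaunched on another node.

```python
from collections import deque


class TaskTracker:
    """Toy stand-in for a TaskTracker (one per slave node). Hypothetical API."""

    def __init__(self, name, slots=2):
        self.name = name
        self.slots = slots    # how many tasks this node may run in parallel
        self.running = []     # ids of tasks currently assigned to this node
        self.alive = True

    def heartbeat(self):
        # A node that is down sends nothing; model that as None.
        if not self.alive:
            return None
        return {"free_slots": self.slots - len(self.running)}


class JobTracker:
    """Toy stand-in for the single JobTracker of a cluster. Hypothetical API."""

    def __init__(self, trackers):
        self.trackers = {t.name: t for t in trackers}
        self.pending = deque()  # tasks waiting for a free slot

    def submit(self, task_ids):
        self.pending.extend(task_ids)

    def process_heartbeats(self):
        beats = {name: t.heartbeat() for name, t in self.trackers.items()}
        # First pass: requeue the tasks of every node that missed its heartbeat.
        for name, beat in beats.items():
            if beat is None:
                dead = self.trackers[name]
                self.pending.extend(dead.running)
                dead.running = []
        # Second pass: assign pending tasks to live nodes with free slots.
        for name, beat in beats.items():
            if beat is None:
                continue
            for _ in range(beat["free_slots"]):
                if not self.pending:
                    break
                self.trackers[name].running.append(self.pending.popleft())


# Usage: three map tasks on two nodes, then node-a fails.
a, b = TaskTracker("node-a"), TaskTracker("node-b")
jt = JobTracker([a, b])
jt.submit(["map-0", "map-1", "map-2"])
jt.process_heartbeats()   # node-a takes two tasks, node-b takes one
a.alive = False
jt.process_heartbeats()   # node-a's tasks are requeued; node-b takes one more
print(b.running, list(jt.pending))
```

In this sketch a relaunched task waits in the queue until a live node reports a free slot, which is why one task stays pending after `node-a` fails. The real JobTracker also handles scheduling policy, data locality, and retry limits, none of which are modeled here.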