A TaskTracker is a slave node daemon in the cluster
that accepts tasks (Map, Reduce and Shuffle operations) from a JobTracker.
Only one TaskTracker process runs on any Hadoop slave node, and it runs in its own JVM process. Every TaskTracker is configured with a set of slots, which indicate the number of tasks it can accept. The TaskTracker starts a separate JVM process for each task it runs (called a Task Instance); this ensures that a failing task process does not take down the TaskTracker itself. The TaskTracker monitors these task instances, capturing their output and exit codes. When a Task Instance finishes, successfully or not, the TaskTracker notifies the JobTracker. TaskTrackers also send heartbeat messages to the JobTracker, usually every few seconds, to reassure the JobTracker that they are still alive. These messages also report the number of available slots, so the JobTracker can stay up to date with where in the cluster work can be delegated.
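
In MRv1 the slot counts are set per node in mapred-site.xml through the mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum properties (both default to 2). Below is a minimal sketch, assuming the Hadoop 1.x configuration files are on the classpath, of reading those values with Hadoop's Configuration API; it is only an illustration of the property names, not how the TaskTracker itself loads them:

    import org.apache.hadoop.conf.Configuration;

    public class SlotConfigExample {
        public static void main(String[] args) {
            // Picks up *-site.xml files from the classpath; on a real slave node
            // mapred-site.xml would supply these values.
            Configuration conf = new Configuration();

            // MRv1 slot properties; the second argument is the default used
            // when the property is not set (2 map and 2 reduce slots per node).
            int mapSlots = conf.getInt("mapred.tasktracker.map.tasks.maximum", 2);
            int reduceSlots = conf.getInt("mapred.tasktracker.reduce.tasks.maximum", 2);

            System.out.println("Map slots:    " + mapSlots);
            System.out.println("Reduce slots: " + reduceSlots);
        }
    }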
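
The separate-JVM idea can be illustrated with plain Java: the parent process launches a child JVM, captures its output, waits for it to finish, and inspects the exit code, which is roughly what the TaskTracker does for each Task Instance before reporting back to the JobTracker. This is only a sketch; the child command here just runs java -version as a stand-in for a real task class:

    import java.io.IOException;
    import java.util.concurrent.TimeUnit;

    public class TaskInstanceSketch {
        public static void main(String[] args) throws IOException, InterruptedException {
            // Launch the work in its own JVM; a crash or out-of-memory error in the
            // child cannot bring down this parent process (the "TaskTracker" here).
            ProcessBuilder pb = new ProcessBuilder("java", "-version");
            pb.redirectErrorStream(true);  // merge stderr into stdout so output can be captured
            Process child = pb.start();

            // Capture the child's output, then wait for it and read the exit code,
            // the kind of status a TaskTracker would relay to the JobTracker.
            String output = new String(child.getInputStream().readAllBytes());
            boolean finished = child.waitFor(1, TimeUnit.MINUTES);
            System.out.println(output);
            System.out.println(finished
                    ? "Task instance exited with code " + child.exitValue()
                    : "Task instance timed out");
        }
    }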