Performance improvement of MapReduce framework by identifying slow TaskTrackers in heterogeneous Hadoop cluster
Date
2016-01-01
Authors
Naik, Nenavath Srinivas
Negi, Atul
Sastry, V. N.
Abstract
MapReduce is widely recognized as an important parallel and distributed programming model for large-scale computing. The MapReduce framework divides a job into map and reduce tasks and schedules these tasks in a distributed manner across the cluster. Task scheduling and the identification of “slow TaskTrackers” in heterogeneous Hadoop clusters are a focus of recent research. MapReduce performance is currently limited by its default scheduler, which does not adapt well to heterogeneous environments. In this paper, we propose a scheduling method to identify “slow TaskTrackers” in a heterogeneous Hadoop cluster and implement it by integrating it with the Hadoop default scheduling algorithm. We compare the performance of this method with that of the Hadoop default scheduler and observe a modest but consistent improvement in heterogeneous environments, achieved by reducing the overall job execution time.
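The abstract does not state the criterion used to label a TaskTracker as slow. As a rough illustration only, the sketch below assumes one plausible heuristic: a tracker is flagged when its average task progress rate falls below a fixed fraction of the cluster-wide average. The function name `find_slow_trackers`, the `threshold` parameter, the tracker names, and the sample rates are all hypothetical and are not taken from the paper or from the Hadoop API.

```python
from statistics import mean


def find_slow_trackers(progress_rates, threshold=0.5):
    """Return the names of trackers considered slow.

    progress_rates maps a tracker name to a list of observed task
    progress rates (fraction of a task completed per second).
    A tracker is flagged when its mean rate is below
    threshold * (mean of all per-tracker means).
    This criterion is an assumption, not the paper's method.
    """
    # Average the observed rates for each tracker, skipping trackers
    # that have reported no tasks yet.
    per_tracker = {
        name: mean(rates) for name, rates in progress_rates.items() if rates
    }
    if not per_tracker:
        return []

    # Cluster-wide baseline: the mean of the per-tracker averages.
    cluster_avg = mean(per_tracker.values())

    # Flag trackers that fall well below the baseline.
    return sorted(
        name for name, rate in per_tracker.items()
        if rate < threshold * cluster_avg
    )


# Hypothetical sample: two comparable nodes and one much slower node.
rates = {
    "tt-1": [0.10, 0.12],
    "tt-2": [0.11, 0.09],
    "tt-3": [0.02, 0.03],
}
slow = find_slow_trackers(rates)
```

A scheduler built on such a check could avoid assigning new tasks to the flagged trackers; how the proposed method actually uses the result is described in the paper itself, not here.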
Keywords
Hadoop
Heterogeneous environments
Job scheduling
MapReduce
TaskTracker
Citation
Smart Innovation, Systems and Technologies. v.44