Enhancing the performance of MapReduce default scheduler by detecting prolonged TaskTrackers in heterogeneous environments

Date
2015-01-01
Authors
Naik, Nenavath Srinivas
Negi, Atul
Sastry, V. N.
Abstract
MapReduce is now a significant parallel processing model for large-scale, data-intensive applications running on clusters of commodity hardware. Scheduling of jobs and tasks, and the identification of slow TaskTrackers in Hadoop clusters, have been a focus of research in recent years. MapReduce performance is currently limited by its default scheduler, which does not adapt well to heterogeneous environments. In this paper, we propose a scheduling method that identifies the TaskTrackers running slowly in the map and reduce phases of the MapReduce framework in a heterogeneous Hadoop cluster. The proposed method is integrated with the MapReduce default scheduling algorithm, and its performance is compared with that of the unmodified default scheduler. We observe that the proposed approach improves on the default scheduler in heterogeneous environments: overall job execution times were reduced for different workloads from the HiBench benchmark suite.
Keywords
Heterogeneous environment, MapReduce, Task scheduler, TaskTrackers
Citation
Advances in Intelligent Systems and Computing, vol. 380