A review of adaptive approaches to MapReduce scheduling in heterogeneous environments

Date
2014-11-26
Authors
Naik, Nenavath Srinivas
Negi, Atul
Sastry, V. N.
Abstract
MapReduce is a significant model for distributed processing of large-scale, data-intensive applications. The default MapReduce scheduler is limited by two assumptions: that the nodes of the cluster are homogeneous and that tasks progress linearly. The scheduler relies on this model to decide when to speculatively re-execute straggler tasks. The assumption of homogeneity does not always hold in practice, and MapReduce does not fundamentally account for heterogeneity among the nodes of a cluster. As a result, straggler tasks extend total job execution time in heterogeneous environments. Adapting to a heterogeneous environment depends on computation and communication, architectures, memory, and power. In this paper, we first explain existing scheduling algorithms and their respective characteristics. We then review scheduling approaches such as LATE, SAMR, and ESAMR, which aim specifically to make MapReduce performance adaptive in heterogeneous environments. We also introduce a novel scheduling approach for MapReduce in heterogeneous environments that is adaptive and learns from past execution performance.
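To make the linear-progress assumption concrete, the following is a minimal sketch of how a scheduler might estimate a task's remaining time and pick a straggler for speculative re-execution, loosely in the spirit of LATE. It is not taken from the paper and is not Hadoop's API: the `Task` class, the function names, and the 0.25 slow-task fraction are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    """Hypothetical record of a running task (not a Hadoop type)."""
    task_id: str
    progress: float  # progress score in [0, 1]
    elapsed: float   # seconds since the task started


def progress_rate(task: Task) -> float:
    """Progress per second, assuming progress grows linearly with time."""
    return task.progress / task.elapsed


def time_left(task: Task) -> float:
    """Estimated remaining seconds under the linear-progress assumption."""
    if task.progress <= 0.0 or task.elapsed <= 0.0:
        return float("inf")  # no information yet, treat as unbounded
    return (1.0 - task.progress) / progress_rate(task)


def pick_straggler(tasks: List[Task], slow_fraction: float = 0.25) -> Optional[Task]:
    """Among the slowest tasks by progress rate, return the one expected
    to finish last. slow_fraction is an assumed tuning parameter."""
    running = [t for t in tasks if t.progress < 1.0 and t.elapsed > 0.0]
    if not running:
        return None
    rates = sorted(progress_rate(t) for t in running)
    # Rate at the chosen percentile; clamp the index to stay in range.
    cutoff = rates[min(int(len(rates) * slow_fraction), len(rates) - 1)]
    slow = [t for t in running if progress_rate(t) <= cutoff]
    return max(slow, key=time_left)


tasks = [
    Task("a", progress=0.9, elapsed=90.0),
    Task("b", progress=0.2, elapsed=100.0),
    Task("c", progress=0.5, elapsed=50.0),
]
candidate = pick_straggler(tasks)
print(candidate.task_id if candidate else "no straggler")
```

On a heterogeneous cluster the estimate can mislead, because a task on a slow node may not progress at a constant rate; this is the gap that adaptive approaches learning from past executions aim to close.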
Keywords
Hadoop, Heterogeneous environment, MapReduce, Speculative execution, Task Scheduling
Citation
Proceedings of the 2014 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2014