Clustering on the cloud: Reducing CLARA to MapReduce

dc.contributor.author Jakovits, Pelle
dc.contributor.author Srirama, Satish Narayana
dc.date.accessioned 2022-03-27T00:16:34Z
dc.date.available 2022-03-27T00:16:34Z
dc.date.issued 2013-10-01
dc.description.abstract Cloud computing, with its promise of virtually limitless resources, seems well suited to resource-intensive problems in machine learning and data mining, since it allows any distributed data mining or machine learning application to be scaled with little difficulty. However, to run on cloud infrastructure, these applications must be reduced to frameworks that can successfully exploit cloud resources, such as Hadoop MapReduce, which offers both automatic parallelization and fault tolerance on commodity cloud hardware. Adapting complex algorithms to the MapReduce model is not trivial, as the model is often better suited to simple, embarrassingly parallel algorithms. Yet some more complex algorithms are suitable for MapReduce, and in this work we look at one such algorithm, Clustering LARge Applications (CLARA), which can be used to cluster very large numbers of objects. The paper describes how CLARA is reduced to the MapReduce model, along with a detailed analysis of the Hadoop MapReduce implementation. The paper also provides a case study in which the algorithm is successfully applied to clustering the pen-based recognition of handwritten digits data set. © 2013 ACM.
dc.identifier.citation ACM International Conference Proceeding Series
dc.identifier.uri 10.1145/2513534.2513546
dc.identifier.uri http://dl.acm.org/citation.cfm?doid=2513534.2513546
dc.identifier.uri https://dspace.uohyd.ac.in/handle/1/3162
dc.subject CLARA
dc.subject classification
dc.subject cloud computing
dc.subject k-medoid clustering
dc.subject MapReduce
dc.title Clustering on the cloud: Reducing CLARA to MapReduce
dc.type Conference Proceeding. Conference Paper