Speeding-up the prototype based kernel k-means clustering method for large data sets

Hitendra Sarma, T.; Viswanath, P.; Negi, Atul

dc.contributor.author	Hitendra Sarma, T.
dc.contributor.author	Viswanath, P.
dc.contributor.author	Negi, Atul
dc.date.accessioned	2022-03-27T05:52:53Z
dc.date.available	2022-03-27T05:52:53Z
dc.date.issued	2016-10-31
dc.description.abstract	Kernel k-means is seen as a non-linear extension of the k-means clustering method, with good performance in identifying non-isotropic and linearly inseparable clusters. However, the space and time requirements of kernel k-means are expensive, with O(n^2) complexity. For present applications, the large in-memory computations make this method unsuitable for large data sets. Recently, a simple prototype-based hybrid approach was proposed to speed up the kernel k-means method for large data sets [1]. The time complexity of this method is O(n + p^2), where p is the number of prototypes. Each prototype is a representative pattern of a group-let of size (threshold) τ. The time complexity of this method depends on p, which in turn depends on the clustering threshold. Increasing the threshold value can decrease the number of prototypes p, but the quality of the clustering result might suffer. Hence, fixing an appropriate value of the threshold is the major challenge in this approach. This paper presents a solution to this problem by allowing τ to vary depending on the location of the group-let in the space. Intuitively, if the group-let is close to a cluster center (and away from the others), then its size can be large, but if it lies somewhere between two cluster centers, then its size should be small. It is experimentally shown that this reduces the clustering time and also increases the clustering accuracy. The presented method is suitable for large data sets such as those encountered in data mining.
dc.identifier.citation	Proceedings of the International Joint Conference on Neural Networks. v.2016-October
dc.identifier.uri	10.1109/IJCNN.2016.7727432
dc.identifier.uri	http://ieeexplore.ieee.org/document/7727432/
dc.identifier.uri	https://dspace.uohyd.ac.in/handle/1/8575
dc.subject	Data mining
dc.subject	Kernel k-means clustering method
dc.title	Speeding-up the prototype based kernel k-means clustering method for large data sets
dc.type	Conference Proceeding. Conference Paper
dspace.entity.type
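
The fixed-threshold baseline described in the abstract (group points into group-lets of radius τ, then run kernel k-means on the p prototypes only) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the single-pass leader-style grouping, the RBF kernel, the count-weighted kernel k-means, and all names (`leader_prototypes`, `rbf_kernel`, `weighted_kernel_kmeans`, `tau`, `gamma`) are hypothetical choices made here. The location-dependent rule for varying τ is not specified in the abstract, so it is not reproduced.

```python
import numpy as np

def leader_prototypes(X, tau):
    """Single pass over X: join the nearest prototype within tau, else start a new one.
    Assumed grouping scheme; the abstract does not spell it out."""
    leaders, counts, assign = [], [], []
    for x in X:
        if leaders:
            d = np.linalg.norm(np.asarray(leaders) - x, axis=1)
            j = int(np.argmin(d))
            if d[j] <= tau:
                counts[j] += 1
                assign.append(j)
                continue
        leaders.append(x)
        counts.append(1)
        assign.append(len(leaders) - 1)
    return np.asarray(leaders), np.asarray(counts, dtype=float), np.asarray(assign)

def rbf_kernel(A, gamma):
    """p x p Gaussian kernel matrix over the prototypes (assumed kernel choice)."""
    sq = np.sum(A ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * A @ A.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def weighted_kernel_kmeans(K, w, k, iters=50, seed=0):
    """Kernel k-means on prototypes, each weighted by its group-let size w."""
    rng = np.random.default_rng(seed)
    p = K.shape[0]
    labels = rng.integers(0, k, size=p)
    for _ in range(iters):
        dist = np.full((p, k), np.inf)  # empty clusters stay at infinite distance
        for c in range(k):
            m = labels == c
            if not m.any():
                continue
            wc = w[m]
            W = wc.sum()
            cross = K[:, m] @ wc / W                      # <phi(x_i), mean_c>
            within = wc @ K[np.ix_(m, m)] @ wc / W ** 2   # ||mean_c||^2
            dist[:, c] = np.diag(K) - 2.0 * cross + within
        new = dist.argmin(axis=1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels

# Synthetic demo data (not from the paper): two Gaussian blobs in 2-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (200, 2)), rng.normal(5.0, 0.3, (200, 2))])

protos, counts, assign = leader_prototypes(X, tau=0.5)
K = rbf_kernel(protos, gamma=0.5)            # p x p instead of n x n
proto_labels = weighted_kernel_kmeans(K, counts, k=2)
labels = proto_labels[assign]                # each point inherits its prototype's label
print(len(X), "points reduced to", len(protos), "prototypes")
```

In this sketch a larger `tau` yields fewer prototypes and a smaller kernel matrix, at the cost of coarser cluster boundaries, which is the trade-off the abstract attributes to the choice of threshold.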

Collections

Computer and Information Sciences - Publications