K-NN Sampling for Visualization of Dynamic Data Using LION-tSNE
Date
2019-12-01
Authors
Dharamsotu, Bheekya
Rani, K. Swarupa
Moiz, Salman Abdul
Rao, C. Raghavendra
Abstract
Dimensionality reduction algorithms, most of which are non-parametric, are often used to visualize multi-dimensional data. Non-parametric methods provide no explicit mapping for adding new data points to an existing embedding, which limits the applicability of visualization in Big Data scenarios. The LION-tSNE (Local Interpolation with Outlier coNtrol t-Distributed Stochastic Neighbor Embedding) method was proposed to overcome this limitation. LION-tSNE builds a tSNE model on a randomly selected sample, which creates an initial visual environment; new data points are then added to this environment using local IDW (Inverse Distance Weighting) interpolation. A randomly selected sample is often not representative of the whole data, which creates inconsistency in the tSNE environment. To overcome this problem, two new sampling methods are proposed, both based on k-NN (k-Nearest Neighbor) graph update properties. It is shown empirically that the proposed methods outperform the existing LION-tSNE method, with 0.5 to 2% higher k-NN accuracy and more consistent results. The study covers five datasets with different characteristics and three different initial solutions of tSNE. A pairwise t-test shows that the results of the proposed methods are statistically significant.
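The local IDW step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `idw_embed`, the `radius` and `power` parameters, and the choice to return `None` when no neighbor lies within the radius are all assumptions made for illustration, and the outlier placement that LION-tSNE performs in that case is not shown.

```python
import numpy as np


def idw_embed(x_new, X_train, Y_train, radius, power=2.0, eps=1e-12):
    """Place x_new in an existing embedding by inverse distance weighting.

    X_train: (n, D) sampled points in the original space (assumed input).
    Y_train: (n, d) their embedded coordinates, e.g. from tSNE.
    Only training points within `radius` of x_new contribute (the "local" part).
    Returns None when no training point is within the radius.
    """
    x_new = np.asarray(x_new, dtype=float)
    X_train = np.asarray(X_train, dtype=float)
    Y_train = np.asarray(Y_train, dtype=float)

    # Distances are measured in the original high-dimensional space.
    dist = np.linalg.norm(X_train - x_new, axis=1)
    near = dist <= radius
    if not near.any():
        return None  # hypothetical convention: caller handles outliers

    d_near = dist[near]
    y_near = Y_train[near]

    # If x_new coincides with a training point, reuse its embedding
    # instead of dividing by zero.
    exact = d_near < eps
    if exact.any():
        return y_near[exact][0].copy()

    # Weights decay as distance^(-power); the result is a weighted mean.
    weights = d_near ** (-power)
    return (weights[:, None] * y_near).sum(axis=0) / weights.sum()
```

For a new point equidistant from two sampled points, the sketch returns the midpoint of their embedded positions; a point far from every sampled point yields `None`, which is where an outlier-control rule would take over.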
Keywords
Big Data,
Dimensionality reduction,
Interpolation,
k-NN graph,
Sampling,
t-Distributed Stochastic Neighbor Embedding,
visualization
Citation
Proceedings - 26th IEEE International Conference on High Performance Computing, HiPC 2019