A survey of distance/similarity measures for categorical data

dc.contributor.author Alamuri, Madhavi
dc.contributor.author Surampudi, Bapi Raju
dc.contributor.author Negi, Atul
dc.date.accessioned 2022-03-27T05:53:05Z
dc.date.available 2022-03-27T05:53:05Z
dc.date.issued 2014-09-03
dc.description.abstract Similarity or distance between two objects plays a fundamental role in many data mining tasks such as classification and clustering. Categorical data, unlike numeric data, has no inherent ordering on its attribute values. This makes devising similarity or distance measures, and hence classifying and clustering categorical data, more challenging. In this paper we formulate a taxonomy of the distance and similarity measures used with data whose attributes are categorical. We divide the existing measures into two broad classes, namely context-free and context-sensitive measures. In addition, we suggest a taxonomy of clustering approaches for categorical data. We also propose a hybrid approach for measuring similarity between objects. We compare the relative strengths and weaknesses of some of the similarity measures and point out future research directions.
dc.identifier.citation Proceedings of the International Joint Conference on Neural Networks
dc.identifier.uri 10.1109/IJCNN.2014.6889941
dc.identifier.uri http://ieeexplore.ieee.org/document/6889941/
dc.identifier.uri https://dspace.uohyd.ac.in/handle/1/8594
dc.subject Categorical data
dc.subject Clustering
dc.subject Similarity
dc.subject Supervised
dc.subject Unsupervised
dc.title A survey of distance/similarity measures for categorical data
dc.type Conference Proceeding. Conference Paper