Code smell detection using multi-label classification approach

Guggulothu, Thirupathi; Moiz, Salman Abdul

Date

2020-09-01

Authors

Guggulothu, Thirupathi

Moiz, Salman Abdul

Abstract

Code smells are characteristics of software that indicate a code or design problem and can make software hard to understand, evolve, and maintain. Several code smell detection tools have been proposed in the literature, but they produce different results because smells are informally defined and subjective in nature. Machine learning techniques help address this subjectivity because they can learn to distinguish the characteristics of smelly and non-smelly source code elements (classes or methods). However, existing machine learning techniques can detect only a single type of smell in a code element, which does not correspond to real-world scenarios, since a single element can have multiple design problems (smells). Further, the mechanisms proposed in the literature do not consider the correlation (co-occurrence) among smells when detecting them. To address these shortcomings, we propose and investigate the use of multi-label classification (MLC) methods to detect whether a given code element is affected by multiple smells. In this proposal, two code smell datasets available in the literature are converted into a multi-label dataset (MLD). In the MLD, we found a positive correlation between the two smells (long method and feature envy). In the classification phase, the two MLC methods considered the correlation among the smells and improved performance (more than 95% accuracy on average) under 10-fold cross-validation with ten iterations. The findings reported help researchers and developers prioritize the critical code elements for refactoring based on the number of code smells detected.
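
The abstract does not describe how the two single-smell datasets are merged or how the correlation is measured. The sketch below shows one plausible reading, under stated assumptions: each dataset maps a method identifier to a binary label, the multi-label dataset is built by joining on shared identifiers, and co-occurrence is measured with the phi coefficient (Pearson correlation on binary labels). The data, the identifiers, and the function names `merge_to_multilabel` and `phi_coefficient` are hypothetical illustrations, not taken from the paper.

```python
from math import sqrt

# Hypothetical toy data (not from the paper): method id -> 1 if smelly, else 0.
long_method = {"m1": 1, "m2": 1, "m3": 0, "m4": 0, "m5": 1, "m6": 0}
feature_envy = {"m1": 1, "m2": 0, "m3": 0, "m4": 0, "m5": 1, "m6": 1}


def merge_to_multilabel(first, second):
    """Join two single-label datasets on the ids they share.

    Returns a dict mapping each shared id to a (first_label, second_label)
    tuple. This join rule is an assumption for illustration only.
    """
    shared_ids = sorted(first.keys() & second.keys())
    return {key: (first[key], second[key]) for key in shared_ids}


def phi_coefficient(label_pairs):
    """Phi coefficient of two binary labels given as (x, y) pairs."""
    n11 = sum(1 for x, y in label_pairs if x == 1 and y == 1)
    n10 = sum(1 for x, y in label_pairs if x == 1 and y == 0)
    n01 = sum(1 for x, y in label_pairs if x == 0 and y == 1)
    n00 = sum(1 for x, y in label_pairs if x == 0 and y == 0)
    denominator = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    # A constant label makes the coefficient undefined; report 0.0 in that case.
    if denominator == 0:
        return 0.0
    return (n11 * n00 - n10 * n01) / denominator


mld = merge_to_multilabel(long_method, feature_envy)
phi = phi_coefficient(list(mld.values()))

# Elements affected by both smells would be the first refactoring candidates.
both_smells = [key for key, labels in mld.items() if sum(labels) == 2]
```

A positive `phi` means the two smells tend to appear in the same methods, which is the kind of co-occurrence a multi-label classifier can exploit. Ranking elements by the number of positive labels, as `both_smells` does, mirrors the prioritization idea in the last sentence of the abstract.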

Keywords

Code smell correlation, Code smells, Code smells detection, Machine learning techniques, Multi-label classification, Refactoring, Software quality

Citation

Software Quality Journal. v.28(3)

URI

10.1007/s11219-020-09498-y
http://link.springer.com/10.1007/s11219-020-09498-y
https://dspace.uohyd.ac.in/handle/1/9176

Collections

Computer and Information Sciences - Publications