Code smell detection using multi-label classification approach
Code smell detection using multi-label classification approach
No Thumbnail Available
Date
2020-09-01
Authors
Guggulothu, Thirupathi
Moiz, Salman Abdul
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
Code smells are characteristics of the software that indicates a code or design problem which can make software hard to understand, evolve, and maintain. There are several code smell detection tools proposed in the literature, but they produce different results. This is because smells are informally defined or subjective in nature. Machine learning techniques help in addressing the issues of subjectivity, which can learn and distinguish the characteristics of smelly and non-smelly source code elements (classes or methods). However, the existing machine learning techniques can only detect a single type of smell in the code element that does not correspond to a real-world scenario as a single element can have multiple design problems (smells). Further, the mechanisms proposed in the literature could not detect code smells by considering the correlation (co-occurrence) among them. To address these shortcomings, we propose and investigate the use of multi-label classification (MLC) methods to detect whether the given code element is affected by multiple smells or not. In this proposal, two code smell datasets available in the literature are converted into a multi-label dataset (MLD). In the MLD, we found that there is a positive correlation between the two smells (long method and feature envy). In the classification phase, the two methods of MLC considered the correlation among the smells and enhanced the performance (on average more than 95% accuracy) for the 10-fold cross-validation with the ten iterations. The findings reported help the researchers and developers in prioritizing the critical code elements for refactoring based on the number of code smells detected.
Description
Keywords
Code smell correlation,
Code smells,
Code smells detection,
Machine learning techniques,
Multi-label classification,
Refactoring,
Software quality
Citation
Software Quality Journal. v.28(3)