IIITK@DravidianLangTech-EACL2021: Offensive Language Identification and Meme Classification in Tamil, Malayalam and Kannada

Date
2021-01-01
Authors
Ghanghor, Nikhil Kumar
Krishnamurthy, Prameshwari
Thavareesan, Sajeetha
Priyadarshini, Ruba
Chakravarthi, Bharathi Raja
Abstract
This paper describes the IIITK team’s submissions to the offensive language identification and troll meme classification shared tasks for Dravidian languages at the DravidianLangTech 2021 workshop at EACL 2021. We used transformer-based pretrained models, along with customized versions trained with custom loss functions. State-of-the-art pretrained CNN models were also used for the image-related tasks. Our best configuration for Tamil troll meme classification achieved a weighted average F1 score of 0.55, and for offensive language identification our system achieved weighted F1 scores of 0.75 for Tamil, 0.95 for Malayalam, and 0.71 for Kannada. We ranked 2nd on Tamil troll meme classification, and 3rd, 3rd, and 4th on offensive language identification in Tamil, Malayalam, and Kannada, respectively. We have open-sourced our code implementations for all models across both tasks on GitHub.
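For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a weighted average F1 score is commonly computed: the per-class F1 scores are averaged with weights proportional to each class's support in the gold labels. The function name `weighted_f1` and the toy labels are illustrative assumptions; this is not the shared-task evaluation script or the authors' code.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 (hypothetical helper)."""
    support = Counter(y_true)  # number of gold examples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp  # gold examples of this class that were missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += (n / total) * f1  # weight each class by its support
    return score


# Toy labels for illustration only (not taken from the shared-task data).
y_true = ["troll", "troll", "troll", "not_troll"]
y_pred = ["troll", "troll", "not_troll", "not_troll"]
print(round(weighted_f1(y_true, y_pred), 4))  # prints 0.7667
```

Because the weights follow class frequency, a system can score well on this metric while performing poorly on rare classes, which is worth keeping in mind when comparing scores across languages with different label distributions.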
Citation
Proceedings of the 1st Workshop on Speech and Language Technologies for Dravidian Languages, DravidianLangTech 2021 at 16th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2021