Machine Translation Evaluation: Manual Versus Automatic—A Comparative Study

Date
2020-01-01
Authors
Maurya, Kaushal Kumar
Ravindran, Renjith P.
Anirudh, Ch Ram
Murthy, Kavi Narayana
Abstract
The quality of machine translation (MT) is best judged by humans well versed in both the source and target languages. However, automatic techniques are often used instead, as they are much faster, cheaper, and language-independent. The goal of this paper is to check for correlation between manual and automatic evaluation, specifically in the context of Indian languages. To the extent that automatic evaluation methods correlate with manual evaluations, we can get the best of both worlds. In this paper, we perform a comparative study of automatic evaluation metrics (BLEU, NIST, METEOR, TER, and WER) against the manual evaluation metric (adequacy) for English-Hindi translation. We also attempt to estimate the manual evaluation score of a given MT output from its automatic evaluation score. The data for the study was sourced from the Workshop on Statistical Machine Translation (WMT14).
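The two analyses the abstract describes — measuring how well an automatic metric tracks human adequacy judgments, and estimating the manual score from the automatic one — can be sketched as a Pearson correlation plus a least-squares fit. This is a minimal illustration, not the paper's actual method or data: the score lists below are hypothetical placeholders, and the paper may well use a different correlation coefficient or estimator.

```python
# Hedged sketch: correlate an automatic metric with manual adequacy, then
# fit a line to estimate adequacy from the metric. Pure-stdlib Python.

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx ** 0.5 * vy ** 0.5)

def fit_line(xs, ys):
    """Least-squares slope a and intercept b for y ≈ a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    b = my - a * mx
    return a, b

# Hypothetical segment-level scores (NOT from the paper):
# a BLEU-like metric in [0, 1] and manual adequacy on a 1-5 scale.
bleu = [0.10, 0.25, 0.40, 0.55, 0.70]
adequacy = [1.5, 2.0, 3.0, 3.5, 4.5]

r = pearson(bleu, adequacy)
a, b = fit_line(bleu, adequacy)
print(f"Pearson r = {r:.3f}")
print(f"estimated adequacy at BLEU 0.50: {a * 0.50 + b:.2f}")
```

A high Pearson r on held-out data would justify using the fitted line to approximate manual adequacy from the automatic score; the same scaffold applies unchanged to NIST, METEOR, TER, or WER scores.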
Keywords
Automatic metrics, Machine translation (MT), Manual metrics, MT evaluation
Citation
Advances in Intelligent Systems and Computing, v. 1079