Machine Translation Evaluation: Manual Versus Automatic—A Comparative Study

Date
2020-01-01
Authors
Maurya, Kaushal Kumar
Ravindran, Renjith P.
Anirudh, Ch Ram
Murthy, Kavi Narayana
Abstract
The quality of machine translation (MT) is best judged by humans well versed in both the source and target languages. However, automatic techniques are often used instead, as they are much faster, cheaper, and language-independent. The goal of this paper is to check for correlation between manual and automatic evaluation, specifically in the context of Indian languages. To the extent that automatic evaluation methods correlate with manual evaluations, we can get the best of both worlds. In this paper, we perform a comparative study of automatic evaluation metrics (BLEU, NIST, METEOR, TER, and WER) against the manual evaluation metric (adequacy) for English-Hindi translation. We also attempt to estimate the manual evaluation score of a given MT output from its automatic evaluation score. The data for the study was sourced from the Workshop on Statistical Machine Translation (WMT14).
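The two analyses the abstract describes — measuring how well an automatic metric tracks human adequacy judgments, and estimating the manual score from the automatic one — can be sketched as a Pearson correlation plus a least-squares fit. This is a minimal illustration, not the paper's actual method or data: the score lists below are hypothetical placeholders, and the paper may well use a different correlation coefficient or estimator.

```python
# Hedged sketch: correlate an automatic metric with manual adequacy, then
# fit a line to estimate adequacy from the metric. Pure-stdlib Python.

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx ** 0.5 * vy ** 0.5)

def fit_line(xs, ys):
    """Least-squares slope a and intercept b for y ≈ a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    b = my - a * mx
    return a, b

# Hypothetical segment-level scores (NOT from the paper):
# a BLEU-like metric in [0, 1] and manual adequacy on a 1-5 scale.
bleu = [0.10, 0.25, 0.40, 0.55, 0.70]
adequacy = [1.5, 2.0, 3.0, 3.5, 4.5]

r = pearson(bleu, adequacy)
a, b = fit_line(bleu, adequacy)
print(f"Pearson r = {r:.3f}")
print(f"estimated adequacy at BLEU 0.50: {a * 0.50 + b:.2f}")
```

A high Pearson r on held-out data would justify using the fitted line to approximate manual adequacy from the automatic score; the same scaffold applies unchanged to NIST, METEOR, TER, or WER scores.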
Keywords
Automatic metrics, Machine translation (MT), Manual metrics, MT evaluation
Citation
Advances in Intelligent Systems and Computing, v. 1079