Applied Linguistics and Translation Studies
Browsing Applied Linguistics and Translation Studies by Title
-
Item: A transfer-rule based verb phrase translation from English to Tamil (2018-01-01) Parameswari, K. ; Nagaraju, V. ; Angeline Linda, K.
Building a machine translation (MT) system between non-cognate languages always poses a number of issues, as many translation divergences are involved. In transfer-based MT, a systematic way of formulating transfer rules is required to handle linguistic differences between languages. This paper explains the three stages in which a transfer-based MT system is built for translating verb phrases from English to Tamil.
-
Item: Acquisition of noun-noun compounds: a study of Kannada-speaking children (University of Hyderabad, 2012-06-16) Swathi, P.G ; Banerjee, Somsukla
-
Item: Anusaaraka: an approach for MT taking insights from the Indian grammatical tradition (University of Hyderabad, 2009-10-30) Padmanathrao, A.A. ; Uma Maheshwara Rao, G.
-
Item: Argument structure of Telugu verbs (University of Hyderabad, 2003-12-23) Chenna Kesava Murthy, M. ; Uma Maheshwara Rao, G.
-
Item: Child bilingualism: a study in second language acquisition (University of Hyderabad, 1996-05-26) Shailendra Kumar Singh ; Singh, Udaya Narayana
-
Item: Complement clauses in Hindi and Gujarati (University of Hyderabad, 1995-01-28) Ara Shah
-
Item: Computational study of transitivity (University of Hyderabad, 1995-09-25) Bhattacharya, Tanmoy ; Dasgupta, Proba
-
Item: Culture, conflict and social fabric - a study of Diane Glancy's select plays (University of Hyderabad, 2012-03-19) Savitha, C ; Suneetha Rani, N
-
Item: DEVELOPMENT OF SAN’ANI ARABIC PARTS-OF-SPEECH TAGGER: A BI-GRUs-CRF MODEL (University of Hyderabad, 2021-12) Mohammed Mohammed Nasser Al-Shehabi, Sabah ; Rajyarama, K.
One of the essential pre-processing tasks for building and improving NLP applications is parts-of-speech tagging. The tagging process assigns an appropriate part-of-speech tag to each word/token in a text. It also plays a fundamental role in developing many natural language processing applications such as syntactic parsing, named-entity recognition, automatic translation, ontology engineering, question answering, and information retrieval. In
-
Item: Development of Telugu-Tamil transfer-based machine translation system: An improvization using divergence index (2019-07-01) Krishnamurthy, Parameswari
Building an automatic, high-quality, robust machine translation (MT) system is a fascinating yet arduous task, as one of the major difficulties lies in cross-linguistic differences or divergences between languages at various levels. The existence of translation divergence precludes straightforward mapping in the MT system. An increase in the number of divergences also increases the complexity, especially in linguistically motivated transfer-based MT systems. This paper discusses the development of a Telugu-Tamil transfer-based MT system and how a divergence index (DI) is built to quantify the number of parametric variations between languages in order to improve the success rate of MT. The DI indicates where effort should be focused for a given language pair to attain better and faster results. In addition, strategies for handling different types of divergences in a transfer-based approach to MT are discussed. The paper also describes the evaluation method and how applying the DI improves the MT system.
-
Item: “Do You See and Hear More? A Study on Telugu Perception Verbs” (2022-01-01) Krishna, P. Phani ; Arulmozi, S. ; Mishra, Ramesh Kumar
Verbs of perception describe the actual perception of some entity, and earlier researchers have emphasized that the lexicon of a language is conceptually oriented and necessary for our daily communicative needs. In this paper, we demonstrate and explain which perception verbs, across the five senses (vision, hearing, smell, taste, touch), have the highest frequencies, using a Telugu corpus and a self-rating task. This study shows greater lexical differentiation than studies based on English and other-language corpora. Based on our analysis, verbs of vision, followed by hearing, are the ones most commonly used by Telugu speakers for daily communicative needs, compared with touch, taste, and smell. Usage of the other senses is inconsistent with other studies, unlike vision and hearing; this may be due to sampling and methodological variations in the corpora of different languages, but in common these two senses play a key role in perception verbs. The study of Telugu perception verbs may give more interesting facts and insights into the cognitive linguistics paradigm.
-
Item: Dynamics of the English hyper word 'of' and its functional equivalents in Telugu (University of Hyderabad, 2010-06-19) Srikanth, M. ; Dadegaonkar, Padmakar
-
Item: An elaboration on the impact of culture on translation of educational texts in multicultural environments (University of Hyderabad, 2012-04-30) Mehdi Asadzadeh ; Uma Maheshwara Rao, G.
-
Item: English - Hindi bilingual electronic thesaurus for translators: a prototype (University of Hyderabad, 2013-12-23) Shamla Medhar ; Shivarama, Padikkal
-
Item: Evaluation of Hindi primers and suggestions of one using HTML (University of Hyderabad, 2006-01-30) Prachi Chaturvedi ; Bapuji, B.R
-
Item: Grammar extraction from treebanks for Hindi and Telugu (2010-01-01) Kolachina, Prasanth ; Kolachina, Sudheer ; Singh, Anil Kumar ; Husain, Samar ; Naidu, Viswanatha ; Sangal, Rajeev ; Bharati, Akshar
Grammars play an important role in many Natural Language Processing (NLP) applications. The traditional approach to creating grammars manually, besides being labor-intensive, has several limitations. With the availability of large-scale syntactically annotated treebanks, it is now possible to automatically extract an approximate grammar of a language in any of the existing formalisms from a corresponding treebank. In this paper, we present a basic approach to extract grammars from dependency treebanks of two Indian languages, Hindi and Telugu. The process of grammar extraction requires a generalization mechanism. Towards this end, we explore an approach which relies on generalization of argument structure over the verbs based on their syntactic similarity. Such a generalization counters the effect of data sparseness in the treebanks. A grammar extracted using this system can not only expand already existing knowledge bases for NLP tasks such as parsing, but also aid in the creation of grammars for languages where none exist. Further, we show that the grammar extraction process can help in identifying annotation errors and thus aid in the task of treebank validation.
-
Item: History, features, and typology of language corpora (2018-03-05) Dash, Niladri Sekhar ; Arulmozi, S.
This book discusses key issues of corpus linguistics like the definition of the corpus, primary features of a corpus, and utilization and limitations of corpora. It presents a unique classification scheme of language corpora to show how they can be studied from the perspective of genre, nature, text type, purpose, and application. A reference to parallel translation corpus is mandatory in the discussion of corpus generation, which the authors thoroughly address here, with a focus on Indian language corpora and English. Web-text corpus, a new development in corpus linguistics, is also discussed with elaborate reference to Indian web text corpora. The book also presents a short history of corpus generation and provides scenarios before and after the advent of computer-generated digital corpora. This book has several important features: it discusses many technical issues of the field in a lucid manner; contains extensive new diagrams and charts for easy comprehension; and presents discussions in simplified English to cater to the needs of non-native English readers. This is an important resource authored by academics who have many years of experience teaching and researching corpus linguistics. Its focus on Indian languages and on English corpora makes it applicable to students of graduate and postgraduate courses in applied linguistics, computational linguistics and language processing in South Asia and across countries where English is spoken as a first or second language.
-
Item: Holistic spatial semantics and post-Talmian motion event typology: A case study of Thai and Telugu (2018-11-01) Naidu, Viswanatha ; Zlatev, Jordan ; Duggirala, Vasanta ; Van De Weijer, Joost ; Devylder, Simon ; Blomberg, Johan
Leonard Talmy's influential binary motion event typology has encountered four main challenges: (a) additional language types; (b) extensive "type-internal" variation; (c) the role of other relevant form classes than verbs and "satellites;" and (d) alternative definitions of key semantic concepts like Motion, Path and Manner. After reviewing these issues, we show that the theory of Holistic Spatial Semantics provides analytical tools for their resolution. In support, we present an analysis of motion event descriptions by speakers of two languages that are troublesome for the original typology: Thai (Tai-Kadai) and Telugu (Dravidian), based on the Frog-story elicitation procedure. Despite some apparently similar typological features, the motion event descriptions in the two languages were found to be significantly different. The Telugu participants used very few verbs in contrast to extensive case marking to express Path and nominals to express Region and Landmark, while the Thai speakers relied largely on serial verbs for expressing Path and on prepositions for expressing Region. Combined with previous research in the field, our findings imply (at least) four different clusters of languages in motion event typology with Telugu and Thai as representative of two such clusters, languages like French and Spanish representing a third cluster, and Swedish and English a fourth. This also implies that many other languages like Italian, Bulgarian, and Basque will appear as "mixed languages," positioned between two or three of these clusters.
-
Item: IIITK@DravidianLangTech-EACL2021: Offensive Language Identification and Meme Classification in Tamil, Malayalam and Kannada (2021-01-01) Ghanghor, Nikhil Kumar ; Krishnamurthy, Prameshwari ; Thavareesan, Sajeetha ; Priyadarshini, Ruba ; Chakravarthi, Bharathi Raja
This paper describes the IIITK team’s submissions to the offensive language identification and troll meme classification shared tasks for Dravidian languages at the DravidianLangTech 2021 workshop at EACL 2021. We used transformer-based pretrained models along with customized versions of them with custom loss functions. State-of-the-art pretrained CNN models were also used for image-related tasks. Our best configuration for Tamil troll meme classification achieved a 0.55 weighted average F1 score, and for offensive language identification, our system achieved weighted F1 scores of 0.75 for Tamil, 0.95 for Malayalam, and 0.71 for Kannada. We ranked 2nd in Tamil troll meme classification, and 3rd, 3rd, and 4th in offensive language identification for Tamil, Malayalam, and Kannada, respectively. We have open-sourced our code implementations for all the models across both tasks on GitHub.
-
Item: Improving the performance of the link parser (2009-12-01) Naidu, Y. Viswanatha ; Singh, Anil Kumar ; Sharma, Dipti Misra ; Bharati, Akshar
The paper describes an approach to extend the coverage of a Link Grammar based parser to constructions that are not currently handled by the grammar. We have identified about thirty types of such constructions so far. In order to make Link Grammar handle these constructions, we introduce a preprocessor and a postprocessor. The idea is to handle such constructions via some analysis and transformations in a preprocessing phase before the sentence is given to the Link Parser and then by adding the missing links in the postprocessing phase. The main part of the paper discusses the constructions not handled by the parser and introduces a rule-based preprocessor and postprocessor. This simple and flexible approach is able to increase the coverage of the parser significantly and allows even a relatively naive user to improve the performance of the parser without disturbing the core grammar. © 2009 IEEE.