Linguistically Motivated Statistical Machine Translation

Author: Deyi Xiong

Publisher: Springer

Published: 2015-02-11

Total Pages: 159

ISBN-13: 9812873562

This book provides a wide variety of algorithms and models for integrating linguistic knowledge into Statistical Machine Translation (SMT). It helps advance conventional SMT to linguistically motivated SMT by enhancing three essential components: the translation, reordering, and bracketing models. It also promotes in-depth study of the impact of linguistic knowledge on machine translation. Finally, it provides a systematic introduction to Bracketing Transduction Grammar (BTG) based SMT, one of the state-of-the-art SMT formalisms, as well as a case study of linguistically motivated SMT on a BTG-based platform.



Using Linguistic Knowledge in Statistical Machine Translation

Author: Rabih Mohamed Zbib

Publisher:

Published: 2010

Total Pages: 162

ISBN-13:

In this thesis, we present methods for using linguistically motivated information to enhance the performance of statistical machine translation (SMT). One of the advantages of the statistical approach to machine translation is that it is largely language-agnostic. Machine learning models are used to automatically learn translation patterns from data. SMT can, however, be improved by using linguistic knowledge to address specific areas of the translation process, where translations would be hard to learn fully automatically. We present methods that use linguistic knowledge at various levels to improve statistical machine translation, focusing on Arabic-English translation as a case study. In the first part, morphological information is used to preprocess the Arabic text for Arabic-to-English and English-to-Arabic translation, which reduces the gap in the complexity of the morphology between Arabic and English. The second method addresses the issue of long-distance reordering in translation to account for the difference in the syntax of the two languages. In the third part, we show how additional local context information on the source side is incorporated, which helps reduce lexical ambiguity. Two methods are proposed for using binary decision trees to control the amount of context information introduced. These methods are successfully applied to the use of diacritized Arabic source in Arabic-to-English translation. The final method combines the outputs of an SMT system and a Rule-based MT (RBMT) system, taking advantage of the flexibility of the statistical approach and the rich linguistic knowledge embedded in the rule-based MT system.



Syntax-based Statistical Machine Translation

Author: Philip Williams

Publisher: Springer Nature

Published: 2022-05-31

Total Pages: 190

ISBN-13: 3031021649

This unique book provides a comprehensive introduction to the most popular syntax-based statistical machine translation models, filling a gap in the current literature for researchers and developers in human language technologies. While phrase-based models have previously dominated the field, syntax-based approaches have proved a popular alternative, as they elegantly solve many of the shortcomings of phrase-based models. The heart of this book is a detailed introduction to decoding for syntax-based models. The book begins with an overview of synchronous context-free grammar (SCFG) and synchronous tree-substitution grammar (STSG) along with their associated statistical models. It also describes how three popular instantiations (Hiero, SAMT, and GHKM) are learned from parallel corpora. It introduces and details hypergraphs and associated general algorithms, as well as algorithms for decoding with both tree and string input. Special attention is given to efficiency, including search approximations such as beam search and cube pruning, data structures, and parsing algorithms. The book consistently highlights the strengths (and limitations) of syntax-based approaches, including their ability to generalize phrase-based translation units, their modeling of specific linguistic phenomena, and their function of structuring the search space.



Statistical Machine Translation

Author: Philipp Koehn

Publisher:

Published: 2010

Total Pages: 433

ISBN-13: 9780511690587



Hybrid Approaches to Machine Translation

Author: Marta R. Costa-jussà

Publisher: Springer

Published: 2016-07-12

Total Pages: 208

ISBN-13: 3319213113

This volume provides an overview of the field of Hybrid Machine Translation (MT) and presents some of the latest research conducted by linguists and practitioners from a range of disciplines. Nowadays, the most important developments in MT are achieved by combining data-driven and rule-based techniques. These combinations typically involve hybridization of different traditional paradigms, such as the introduction of linguistic knowledge into statistical approaches to MT, the incorporation of data-driven components into rule-based approaches, or statistical and rule-based pre- and post-processing for both types of MT architectures. The book is of interest primarily to MT specialists, but also – in the wider fields of Computational Linguistics, Machine Learning and Data Mining – to translators and managers of translation companies and departments who are interested in recent developments concerning automated translation tools.



Quality Estimation for Machine Translation

Author: Lucia Specia

Publisher: Springer Nature

Published: 2022-05-31

Total Pages: 148

ISBN-13: 3031021681

Many applications within natural language processing involve performing text-to-text transformations, i.e., given a text in natural language as input, systems are required to produce a version of this text (e.g., a translation), also in natural language, as output. Automatically evaluating the output of such systems is an important component in developing text-to-text applications. Two approaches have been proposed for this problem: (i) to compare the system outputs against one or more reference outputs using string matching-based evaluation metrics and (ii) to build models based on human feedback to predict the quality of system outputs without reference texts. Despite their popularity, reference-based evaluation metrics are faced with the challenge that multiple good (and bad) quality outputs can be produced by text-to-text approaches for the same input. This variation is very hard to capture, even with multiple reference texts. In addition, reference-based metrics cannot be used in production (e.g., online machine translation systems), when systems are expected to produce outputs for any unseen input. In this book, we focus on the second set of metrics, so-called Quality Estimation (QE) metrics, where the goal is to provide an estimate on how good or reliable the texts produced by an application are without access to gold-standard outputs. QE enables different types of evaluation that can target different types of users and applications. Machine learning techniques are used to build QE models with various types of quality labels and explicit features or learnt representations, which can then predict the quality of unseen system outputs. This book describes the topic of QE for text-to-text applications, covering quality labels, features, algorithms, evaluation, uses, and state-of-the-art approaches. It focuses on machine translation as application, since this represents most of the QE work done to date. It also briefly describes QE for several other applications, including text simplification, text summarization, grammatical error correction, and natural language generation.



Introducing linguistic knowledge into statistical machine translation

Author: Adrià Gispert Ramis

Publisher:

Published: 2007

Total Pages:

ISBN-13: 9788469055632



A Machine Translation Approach to Cross Language Text Retrieval

Author: María Gabriela Fernandez-Diaz

Publisher: Universal-Publishers

Published: 2005-03

Total Pages: 137

ISBN-13: 1581122675

Cross Language Text Retrieval (CLTR) has been defined as the retrieval of documents in a language different from that of the original query. To make this possible, some kind of mechanism has to be applied in order to translate the information contained in the source sentence. Many different approaches have been tried with the purpose of transferring the information from the source language query to the target language one. Though all these methods deal with a way of translating as much information as possible from the source query, little research has been conducted in relation to the field of Machine Translation (MT). The purpose of this research work is to determine the feasibility of using MT techniques for CLTR. Specifically, I will describe how an MT system has been adapted without much effort to translate Spanish queries of a specific domain, i.e., Finance and Economics, into English in order to retrieve documents related to that field. The results of this process will then be compared with the results obtained from the retrieval of the original English queries. Thus, I will discuss the advantages and disadvantages of using MT for CLTR.



Neural Machine Translation

Author: Philipp Koehn

Publisher: Cambridge University Press

Published: 2020-06-18

Total Pages: 410

ISBN-13: 1108601766

Deep learning is revolutionizing how machine translation systems are built today. This book introduces the challenge of machine translation and evaluation, including historical, linguistic, and applied context, then develops the core deep learning methods used for natural language applications. Code examples in Python give readers a hands-on blueprint for understanding and implementing their own machine translation systems. The book also provides extensive coverage of machine learning tricks, issues involved in handling various forms of data, model enhancements, and current challenges and methods for analysis and visualization. Summaries of the current research in the field make this a state-of-the-art textbook for undergraduate and graduate classes, as well as an essential reference for researchers and developers interested in other applications of neural methods in the broader field of human language processing.



Human Language Technologies – The Baltic Perspective

Author: I. Skadiņa

Publisher: IOS Press

Published: 2016-10-14

Total Pages: 188

ISBN-13: 1614997012

Throughout the last decade, the Baltic states have played an active role in regional and international language technology activities, supporting less-resourced languages in the digital age. This book presents the proceedings of the 7th International Conference: Human Language Technologies – The Baltic Perspective (Baltic HLT 2016), held in Riga, Latvia, in October 2016. Baltic HLT 2016 provided a forum for sharing ideas and recent advances in human language processing with a special focus on less-resourced languages. Papers selected for the conference cover a wide range of topics, including a general overview of language technology progress in the Baltic states, current research topics in written and spoken language processing, the creation of language resources and their applications, and proposals for a European language platform. The book is divided into five sections: overview; speech technologies and corpora; machine translation; written language resources; and methods and tools for language processing. The book will be a useful resource, not only for Baltic language researchers, but also for those working with other less-resourced languages in Europe and beyond.

