Developing Linguistic Corpora

Author: Martin Wynne

Publisher: Oxbow Books Limited

Published: 2005

Total Pages: 100

ISBN-13:

A linguistic corpus is a collection of texts which have been selected and brought together so that language can be studied on the computer. Today, corpus linguistics offers some of the most powerful new procedures for the analysis of language, and the impact of this dynamic and expanding sub-discipline is making itself felt in many areas of language study. In this volume, a selection of leading experts in various key areas of corpus construction offer advice in a readable and largely non-technical style to help the reader to ensure that their corpus is well designed and fit for the intended purpose. This guide is aimed at those who are at some stage of building a linguistic corpus. Little or no knowledge of corpus linguistics or computational procedures is assumed, although it is hoped that more advanced users will find the guidelines here useful. It is also aimed at those who are not building a corpus, but who need to know something about the issues involved in the design of corpora in order to choose between available resources and to help draw conclusions from their studies.


Corpora in Language Acquisition Research

Author: Heike Behrens

Publisher: John Benjamins Publishing

Published: 2008

Total Pages: 280

ISBN-13: 9789027234766

Corpus research forms the backbone of research on children's language development. Leading researchers in the field present a survey on the history of data collection, different types of data, and the treatment of methodological problems. Morphologically and syntactically parsed corpora allow for the concise explorations of formal phenomena, the quick retrieval of errors, and reliability checks. New probabilistic and connectionist computations investigate how children integrate the multiple sources of information available in the input, and new statistical methods compute rates of acquisition as well as error rates dependent on sample size. Sample analyses show how multi-modal corpora are used to investigate the interaction of discourse and linguistic structure, how cross-linguistic generalizations for acquisition can be formulated and tested, and how individual variation can be explored. Finally, ways in which corpus research interacts with computational linguistics and experimental research are presented.


Creating and Digitizing Language Corpora

Author: J. Beal

Publisher: Palgrave Macmillan

Published: 2007-06-27

Total Pages: 245

ISBN-13: 9781403943668

A range of electronic corpora is increasingly accessible via the WWW and CD-ROM. This development coincided with improved standards governing the collecting, encoding and archiving of such data. This book looks at developing similar standards for enriching and preserving unconventional data: dialects, child language and bilingual databases.


Language Corpora Annotation and Processing

Author: Niladri Sekhar Dash

Publisher: Springer Nature

Published: 2021

Total Pages:

ISBN-13: 9811629609

This book addresses the research, analysis, and description of the methods and processes that are used in the annotation and processing of language corpora in advanced, semi-advanced, and non-advanced languages. It provides the background information and empirical data needed to understand the nature and depth of problems related to corpus annotation and text processing and shows readers how the linguistic elements found in texts are analyzed and applied to develop language technology systems and devices. As such, it offers valuable insights for researchers, educators, and students of linguistics and language technology.


Building a National Corpus

Author: Dawn Knight

Publisher: Springer Nature

Published: 2021-10-08

Total Pages: 192

ISBN-13: 3030818586

This book aims to provide a micro-level, working model of a methodological approach and practical guidelines for building a corpus, informed by the work on the CorCenCC project (Corpws Cenedlaethol Cymraeg Cyfoes - the National Corpus of Contemporary Welsh). It focuses specifically on the development of detailed design frames for corpora across communicative modes (spoken, written and e-language), and the practical processes involved in the planning, collection, transcription, collation and (re)presentation of language data. The book is designed to be of significant value and relevance to those interested in critically engaging with corpus methodology. Although Welsh is the language under discussion, the processes and approaches discussed in the building of CorCenCC can be applied to a lesser or greater extent to other language contexts. This book provides a working model and an account of how to build a corpus dataset, from which step-by-step guidelines for creating other linguistic corpora in any language can be easily extrapolated. It will be of value to students and scholars of minority languages and corpus linguistics.


Advances in Corpus Linguistics

Author:

Publisher: BRILL

Published: 2016-09-12

Total Pages: 429

ISBN-13: 9004333711

This book provides an up-to-date survey of current issues and approaches in corpus linguistics in the form of twenty-two recent research articles. The articles cover a wide range of topics illustrating the diversity of research that is characteristic of corpus linguistics today. Central themes are the relationship between theory, intuition and corpus data and the role of corpora in linguistic research. The majority of the articles are empirical studies of specific aspects of English, ranging from lexis and grammar to discourse and pragmatics. Other areas explored are language variation, language change and development, language learning, cross-linguistic comparisons of English and other languages, and the development of linguistic software tools. The contributors to the volume include some of the leading figures in the field such as M.A.K. Halliday, John Sinclair, Geoffrey Leech and Michael Hoey. The theoretical and methodological issues addressed in the volume demonstrate clearly the steady advance of an expanding discipline inspired by an empirical, usage-based approach to the study of language. The volume is essential reading for researchers and students interested in the use of computer corpora in linguistic research.


Spoken Corpora and Linguistic Studies

Author: Tommaso Raso

Publisher: John Benjamins Publishing Company

Published: 2014-11-14

Total Pages: 508

ISBN-13: 9027270031

The authors of this book share a common interest in the following topics: the importance of corpora compilation for the empirical study of human language; the importance of pragmatic categories such as emotion, attitude, illocution and information structure in linguistic theory; and a passionate belief in the central role of prosody for the analysis of speech. Four distinct sections (spoken corpora compilation; spoken corpora annotation; prosody; and syntax and information structure) give the book its structure. In them, the authors present innovative methodologies that focus on the compilation of third-generation spoken corpora and on multilevel spoken corpora annotation and its functions, and they initiate a debate about the reference unit in the study of spoken language via information structure. The book is accompanied by a web site with a rich array of audio/video files. The web site can be found at the following address: DOI: 10.1075/scl.61.media


An Introduction to Corpus Linguistics

Author: Graeme Kennedy

Publisher: Routledge

Published: 2014-09-19

Total Pages: 328

ISBN-13: 1317892585

The use of large, computerized bodies of text for linguistic analysis and description has emerged in recent years as one of the most significant and rapidly-developing fields of activity in the study of language. This book provides a comprehensive introduction and guide to Corpus Linguistics. All aspects of the field are explored, from the various types of electronic corpora that are available to instructions on how to design and compile a corpus. Graeme Kennedy surveys the development of corpora for use in linguistic research, looking back to the pre-electronic age as well as to the massive growth of computer corpora in the electronic age.


Corpus Linguistics

Author: Douglas Biber

Publisher: Cambridge University Press

Published: 1998-04-23

Total Pages:

ISBN-13: 1316582566

This book is about investigating the way people use language in speech and writing. It introduces the corpus-based approach to linguistics, based on analysis of large databases of real language examples stored on computer. Each chapter focuses on a different area of linguistics, including lexicography, grammar, discourse, register variation, language acquisition, and historical linguistics. Example analyses are presented in each chapter to provide concrete descriptions of the research methods and advantages of corpus-based techniques. Ten methodology boxes provide clear and concise explanations of the issues in doing corpus-based research and reading corpus-based studies and there is a useful appendix of resources for corpus-based investigation. This lucid and comprehensive introduction to the subject will be welcomed by a broad range of readers, from undergraduate students to professional researchers.


Corpus Linguistics for Vocabulary

Author: Paweł Szudarski

Publisher: Routledge

Published: 2017-09-25

Total Pages: 228

ISBN-13: 1351608045

Corpus Linguistics for Vocabulary provides a practical introduction to using corpus linguistics in vocabulary studies. Using freely available corpus tools, the author provides a step-by-step guide on how corpora can be used to explore key vocabulary-related research questions and topics such as: the frequency of English words and how to choose which ones should be taught to learners; how spoken vocabulary differs from written vocabulary, and how academic vocabulary differs from general vocabulary; and how vocabulary contributes to the structure of discourse, and the pragmatic functions it fulfils. Featuring case studies and tasks throughout, Corpus Linguistics for Vocabulary provides a clear and accessible guide and is essential reading for students and teachers wanting to understand, appreciate and conduct corpus-based research in vocabulary studies.

