History, Features, and Typology of Language Corpora

Author: Niladri Sekhar Dash

Publisher: Springer

Published: 2018-02-01

Total Pages: 293

ISBN-10: 9811074585

This book discusses key issues of corpus linguistics like the definition of the corpus, primary features of a corpus, and utilization and limitations of corpora. It presents a unique classification scheme of language corpora to show how they can be studied from the perspective of genre, nature, text type, purpose, and application. A reference to parallel translation corpus is mandatory in the discussion of corpus generation, which the authors thoroughly address here, with a focus on Indian language corpora and English. Web-text corpus, a new development in corpus linguistics, is also discussed with elaborate reference to Indian web text corpora. The book also presents a short history of corpus generation and provides scenarios before and after the advent of computer-generated digital corpora. This book has several important features: it discusses many technical issues of the field in a lucid manner; contains extensive new diagrams and charts for easy comprehension; and presents discussions in simplified English to cater to the needs of non-native English readers. This is an important resource authored by academics who have many years of experience teaching and researching corpus linguistics. Its focus on Indian languages and on English corpora makes it applicable to students of graduate and postgraduate courses in applied linguistics, computational linguistics and language processing in South Asia and across countries where English is spoken as a first or second language.


New Methods in Historical Corpora

Author: Paul Bennett, Martin Durrell, Silke Scheible, Richard J. Whitt

Publisher: BoD – Books on Demand

Published: 2013-09-22

Total Pages: 286

ISBN-10: 3823367609

Investigating the history of a language depends on fragmentary sources, but electronic corpora offer the possibility of alleviating the problem of 'bad data'. But they cannot overcome it totally, and questions arise of the optimal architecture for a corpus and its representativeness of actual language use, and how a historical corpus can best be annotated to maximize its usefulness. Immense strides have been made in recent years in addressing these questions, with exciting new methods and technological advances. The papers in this volume, which were presented at a conference on New Methods in Historical Corpora (Manchester 2011), exemplify the wide range of these recent developments.


Understanding Corpus Linguistics

Author: Danielle Barth

Publisher: Routledge

Published: 2021-11-18

Total Pages: 276

ISBN-10: 1000466752

This textbook introduces the fundamental concepts and methods of corpus linguistics for students approaching this topic for the first time, putting specific emphasis on the enormous linguistic diversity represented by approximately 7,000 human languages and broadening the scope of current concerns in general corpus linguistics. Including a basic toolkit to help the reader investigate language in different usage contexts, this book: shows the relevance of corpora to a range of linguistic areas from phonology to sociolinguistics and discourse; covers recent developments in the application of corpus linguistics to the study of understudied languages and linguistic typology; features exercises, short problems, and questions; and includes examples from real studies in over 15 languages plus multilingual corpora. Providing the necessary corpus linguistics skills to critically evaluate and replicate studies, this book is essential reading for anyone studying corpus linguistics.


Developing Linguistic Corpora

Author: Martin Wynne

Publisher: Oxbow Books Limited

Published: 2005

Total Pages: 100

ISBN-13:

A linguistic corpus is a collection of texts which have been selected and brought together so that language can be studied on the computer. Today, corpus linguistics offers some of the most powerful new procedures for the analysis of language, and the impact of this dynamic and expanding sub-discipline is making itself felt in many areas of language study. In this volume, a selection of leading experts in various key areas of corpus construction offer advice in a readable and largely non-technical style to help the reader to ensure that their corpus is well designed and fit for the intended purpose. This guide is aimed at those who are at some stage of building a linguistic corpus. Little or no knowledge of corpus linguistics or computational procedures is assumed, although it is hoped that more advanced users will find the guidelines here useful. It is also aimed at those who are not building a corpus, but who need to know something about the issues involved in the design of corpora in order to choose between available resources and to help draw conclusions from their studies.


Introducing Linguistic Research

Author: Svenja Voelkel

Publisher: Cambridge University Press

Published: 2021-09-09

Total Pages: 413

ISBN-10: 1316946533

Over the past decade, conducting empirical research in linguistics has become increasingly popular. The first of its kind, this book provides an engaging and practical introduction to this exciting, versatile field, providing a comprehensive overview of research aspects in general, and covering a broad range of subdiscipline-specific methodological approaches. Subfields covered include language documentation and descriptive linguistics, language typology, corpus linguistics, sociolinguistics and anthropological linguistics, cognitive linguistics and psycholinguistics, and neurolinguistics. The book reflects on the strengths and weaknesses of each approach and on how they interact with one another across the study of language in its many diverse facets. It also includes exercises, example student projects and recommendations for further reading, along with additional online teaching materials. Providing hands-on experience, and written in an engaging and accessible style, this unique and comprehensive guide will give students the inspiration they need to develop their own research projects in empirical linguistics.


Corpus-based Perspectives in Linguistics

Author: Yuji Kawaguchi

Publisher: John Benjamins Publishing

Published: 2007

Total Pages: 464

ISBN-13: 9789027233189

UBLI has conducted field surveys since 2002 and built spoken language corpora for French, Spanish, Italian (Salentino dialect), Russian, Malaysian, Turkish, Japanese, and Canadian multilinguals. This volume features new research presented at the second UBLI workshop on Corpus Linguistics – Research Domain, which was held on September 14, 2006. The first part, consisting of eleven presentations from this workshop, covers a wide range of subjects within the area of corpus-based research, such as dictionaries, linguistic atlases, dialects, translation, ancient texts, non-standard texts, sociolinguistics, second language acquisition, and natural language processing. The second part of this volume comprises ten additional contributions on both written and spoken corpora by the members and research assistants of UBLI.


Corpus Linguistics: An Introduction

Author: Dash, Niladri Sekhar

Publisher: Pearson Education India

Published: 2008

Total Pages: 208

ISBN-10: 8131752623

Corpus Linguistics: An Introduction will appeal to a wide spectrum of scholars, researchers, and particularly to students of linguistics. It offers guidelines for the creation and usage of corpora in the form of empirical language databases with direct functional and theoretical interpretation of a natural language. Drawn from original research and written in an accessible language and style, this book will create avenues for further advancements in mainstream and applied linguistics and language technology.


English Corpus Linguistics: Variation in Time, Space and Genre.

Author: Gisle Andersen

Publisher: Rodopi

Published: 2013

Total Pages: 256

ISBN-10: 9401209405

As its title suggests, this book is a selection of papers that use English corpora to study language variation along three dimensions – time, place and genre. In broad terms, the book aims to bridge the gap between corpus linguistics and sociolinguistics and to increase our knowledge of the characteristics of the English language. It includes eleven papers which address a variety of research questions but share a corpus-based methodology. Some of the contributions deal with language variation in time, either by looking into historical corpora of English or by adopting the method known as diachronic comparable corpus linguistics, thus illustrating how corpora can be used to illuminate either historical or recent developments of English. Other studies investigate variation in space by comparing different varieties of English, including some of the “New Englishes” such as the South Asian varieties of English. Finally, some of the papers deal with variation in genre, by looking into the use of language for specific purposes through the inspection of medical articles, social reports and academic writing.


Teaching and Language Corpora

Author: Anne Wichmann

Publisher: Routledge

Published: 2014-06-11

Total Pages: 362

ISBN-10: 1317889584

Corpora are well established as a resource for language research; they are now also increasingly being used for teaching purposes. This book is the first of its kind to deal explicitly and in a wide-ranging way with the use of corpora in teaching. It contains an extensive collection of articles by corpus linguists and practising teachers, covering not only the use of data to inform and create teaching materials but also the direct exploitation of corpora by students, both in the study of linguistics in general and in the acquisition of proficiency in individual languages, including English, Welsh, German, French and Italian. In addition, the book offers practical information on the sources of corpora and concordances, including those suitable for work on non-Roman scripts such as Greek and Cyrillic.


Using Corpora to Explore Linguistic Variation

Author: Randi Reppen

Publisher: John Benjamins Publishing

Published: 2002-11-29

Total Pages: 289

ISBN-10: 9027296162

Using Corpora to Explore Linguistic Variation illustrates the ways in which linguistic variation can be explored through corpus-based investigation. Two major kinds of research questions are considered: variation in the use of a particular linguistic feature, and variation across dialects or registers. Part 1: “Exploring variation in the use of linguistic features” focuses on the study of specific words, expressions, or grammatical constructions, to study variation in the use of a particular linguistic feature. Part 2: “Exploring dialect and register variation” describes salient characteristics of dialects or registers and the patterns of variation across varieties. Part 3: “Exploring Historical Variation” applies these same two major perspectives to historical variation. One recurring theme is the extent to which linguistic variation depends on register differences, reflecting the importance of register as a key methodological and thematic concern in current corpus linguistic research.

