Linguistic Corpora and Big Data in Spanish and Portuguese
(eBook)
Contributors
Bonilla, Johnatan E., Contributor
Bouzouita, Miriam, Contributor
Calderón Campos, Miguel, Contributor
Calderón Campos, Miguel, Editor
Published
Berlin ; De Gruyter, [2024].
Format
eBook
ISBN
9783110781465
More Details
Language
English
DOI
10.1515/9783110781465
Notes
Restrictions on Access
Open Access: https://purl.org/coar/access_right/c_abf2
Description
In recent decades, corpus linguistics has developed rapidly in the Hispanic world along two opposite but complementary lines: increasing corpus size (corpus linguistics as Big Data) and improving document selection and data annotation (corpus linguistics as High Quality Data). The first approach has led to the creation of massive corpora such as EsTenTen and, at the same time, has promoted the use of the web and social networks as corpora. The second has given rise to specialized corpora such as Post Scriptum and Oralia Diacrónica del español (ODE). The contributions gathered in this volume combine both methods in order to exploit their advantages and overcome their possible limitations. On the one hand, the volume addresses the creation and design of small corpora focused on data quality; on the other hand, it offers case studies that draw on both specialized corpora and massive data extracted from the web. The main idea of this book is to highlight the complementary nature of the two methods.
Additional Physical Form
Issued also in print.
System Details
Mode of access: Internet via World Wide Web.
Terms Governing Use and Reproduction
This eBook is made available Open Access under a CC BY-NC-ND 4.0 license: https://creativecommons.org/licenses/by-nc-nd/4.0