Design and construction of Guayaquil radio speech corpus (CHARG).

Authors :
Sawicka-Stępińska, Brygida
Source :
Language Resources & Evaluation. Sep 2023, Vol. 57, Issue 3, pp. 1405-1422. 18 pp.
Publication Year :
2023

Abstract

This paper describes the creation of CHARG, the Corpus de Habla Radiofónica de Guayaquil (Guayaquil Radiophonic Speech Corpus), the first systematized spoken corpus for this under-researched variety of Spanish. Guayaquil is the most populous city of Ecuador, whereas the capital is Quito. Ecuador is therefore a rare case of a Spanish-speaking country with two major urban centers that belong to two separate dialectal zones, which makes for a distinctive sociolinguistic context. CHARG is composed of Guayaquil radio programs and is structured by a non-linguistic criterion (program type) in order to ensure a representative and balanced sample. The paper describes the design of the corpus (defining the study population, sample, and stratification) and its construction (recording procedure, speaker and speech style coding, transcription, and annotation). The resulting corpus consists of 24 h of transcribed and annotated recordings from 142 speakers. The paper serves two purposes: it presents a step-by-step, replicable procedure for corpus construction, and it introduces the corpus itself as research material. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
1574-020X
Volume :
57
Issue :
3
Database :
Academic Search Index
Journal :
Language Resources & Evaluation
Publication Type :
Academic Journal
Accession number :
170029265
Full Text :
https://doi.org/10.1007/s10579-023-09649-0