Speech-to-text Recognition for the Creation of Subtitles in Basque: An Analysis of ADITU Based on the NER Model

Authors :
Filología Inglesa y Alemana y Traducción e Interpretación
Ingeles eta Aleman Filologia eta Itzulpengintza eta Interpretazioa
Tamayo Masero, Ana
Ros Abaurrea, Alejandro
Publication Year :
2024

Abstract

This contribution analyses the speech-to-text recognition of news programmes on the regional channel ETB1 for subtitling in Basque using ADITU (2024), a technology developed by the Elhuyar foundation, by applying the NER model of analysis (Romero-Fresco and Martínez 2015). A total of 20 samples of approximately 5 minutes each were recorded from ETB1 in May 2022. In all, 97 minutes and 1737 subtitles were analysed by applying criteria from the NER model. The results show an average accuracy rate of 94.63% if all errors are taken into account, and 96.09% if punctuation errors are excluded. A qualitative analysis based on the quantitative data points to room for improvement in the language models of the software, punctuation, recognition of proper nouns and speaker identification. From the evidence it may be concluded that, although the quantitative data do not reach the threshold at which the NER model would consider the quality of recognition fair or comprehensible, the results seem promising. When presenters speak with clear diction and standard language, accuracy rates are sufficient for a minority language like Basque, for which speech recognition software is still in the early phases of development.
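For readers unfamiliar with how an NER accuracy rate is obtained, the following is a minimal sketch, not the authors' own tooling. It assumes the formula usually given for the NER model, accuracy = (N − E − R) / N × 100, where N is the number of words in the subtitles, E the weighted edition errors and R the weighted recognition errors, with severity weights of 0.25 (minor), 0.5 (standard) and 1 (serious). Neither the formula nor the weights are stated in this record, the function names are hypothetical, and the sample counts are invented purely for illustration; they are not data from the study.

```python
# Severity weights commonly cited for the NER model.
# Assumption: these values are not given in this record.
WEIGHTS = {"minor": 0.25, "standard": 0.5, "serious": 1.0}


def weighted_errors(counts):
    """Sum error counts, each multiplied by its severity weight."""
    return sum(WEIGHTS[severity] * n for severity, n in counts.items())


def ner_accuracy(n_words, edition_errors, recognition_errors):
    """Return the accuracy rate in percent: (N - E - R) / N * 100."""
    if n_words <= 0:
        raise ValueError("n_words must be positive")
    e = weighted_errors(edition_errors)
    r = weighted_errors(recognition_errors)
    return (n_words - e - r) / n_words * 100


# Hypothetical sample, NOT taken from the study.
sample_rate = ner_accuracy(
    n_words=1000,
    edition_errors={"minor": 4, "standard": 2, "serious": 0},
    recognition_errors={"minor": 8, "standard": 6, "serious": 3},
)
print(f"Accuracy rate: {sample_rate:.2f}%")
```

The NER model is usually described as treating 98% as the minimum acceptable rate, which is the kind of threshold the abstract says the reported averages of 94.63% and 96.09% fall short of.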

Details

Database :
OAIster
Notes :
This work was carried out within the research group TRALIMA/ITZULIK (UPV/EHU, with reference UPV/EHU GIU21/060) and the ALMA research network (RED2018-102475-T). This research was funded by the research project “QUALISUB, The Quality of Live Subtitling: A regional, national and international study” (PID2020-117738RB-I00). English
Publication Type :
Electronic Resource
Accession number :
edsoai.on1430741873
Document Type :
Electronic Resource