Towards cross-lingual voice cloning in higher education
- Source :
- RiuNet. Repositorio Institucional de la Universitat Politècnica de València
- Publication Year :
- 2021
- Publisher :
- Elsevier, 2021.
Abstract
- The rapid progress of modern AI tools for automatic speech recognition and machine translation is progressively reducing the cost of producing publishable subtitles for educational videos in multiple languages. Similarly, text-to-speech technology is improving substantially in quality, flexibility and capabilities. In particular, state-of-the-art systems can now handle multiple languages and speakers seamlessly in an integrated manner, thus enabling the cloning of a lecturer's voice in languages she/he might not even speak. This work reports the experience gained in using such systems at the Universitat Politècnica de València (UPV), mainly as guidance for other educational organizations willing to conduct similar studies. It builds on previous work on UPV's main repository of educational videos, MediaUPV, to produce multilingual subtitles at scale and low cost. Here, a detailed account is given of how this work has been extended to also allow for massive machine dubbing of MediaUPV. This includes collecting 59 h of clean speech data from UPV's academic staff, and extending our subtitle production pipeline with a state-of-the-art multilingual and multi-speaker text-to-speech system trained on the collected data. Our main result comes from an extensive, subjective evaluation of this system by the lecturers who contributed to data collection. In brief, it is shown that text-to-speech technology is not only mature enough for its application to MediaUPV, but also needed as soon as possible by students to improve its accessibility and bridge language barriers.
  We wish first to thank all UPV lecturers who made this study possible. We are also very grateful for the funding support received from the European Union's Horizon 2020 research and innovation programme under grant agreement no. 761758 (X5gon), the Spanish government under grant RTI2018-094879-B-I00 (Multisub, MCIU/AEI/FEDER), and the Universitat Politècnica de València (Spain) PAID-01-17 R&D support programme. Funding for open access charge: CRUE-Universitat Politècnica de València.
- Subjects :
- 10. Reduce inequality within and among countries
Machine translation
Higher education
Computer science
Language barrier
Cross-lingual voice conversion
Library and Information Science
02 engineering and technology
Artificial Intelligence
020204 information systems
0202 electrical engineering, electronic engineering, information engineering
Quality (business)
Electrical and Electronic Engineering
Educational resources
Flexibility (engineering)
Data collection
Cloning (programming)
Multimedia
4. Education
05 social sciences
050301 education
Cost reduction
Text-to-speech
Control and Systems Engineering
Multilinguality
OER
04. Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all
0503 education
Computer Languages and Systems
Details
- Language :
- English
- Database :
- OpenAIRE
- Journal :
- RiuNet. Repositorio Institucional de la Universitat Politècnica de València
- Accession number :
- edsair.doi.dedup.....48b96e00877a154dd7580c0f745470ad
- Full Text :
- https://doi.org/10.1016/j.engappai.2021.104413