Supervised Acoustic Embeddings and Their Transferability Across Languages
- Publication Year: 2023
- Publisher: arXiv
Abstract
- In speech recognition, it is essential to model the phonetic content of the input signal while discarding irrelevant factors such as speaker variation and noise, which is challenging in low-resource settings. Self-supervised pre-training has been proposed as a way to improve both supervised and unsupervised speech recognition, including frame-level feature representations and Acoustic Word Embeddings (AWE) for variable-length segments. However, self-supervised models alone cannot perfectly separate the linguistic content, as they are trained to optimize indirect objectives. In this work, we experiment with different pre-trained self-supervised features as input to AWE models and show that they work best within a supervised framework. Models trained on English can be transferred to other languages with no adaptation and outperform self-supervised models trained solely on the target languages.
- Comment: Presented at ICNLSP 2022
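- To make the setup concrete, below is a minimal sketch of a supervised AWE model of the kind the abstract describes: variable-length frame-level features from a pre-trained self-supervised model (e.g., wav2vec 2.0, extracted offline) are pooled by a recurrent encoder into a fixed-dimensional embedding and trained with a triplet objective using word labels. The architecture, dimensions, and margin here are illustrative assumptions, not the paper's exact configuration.

```python
# A minimal sketch (assumed architecture, not the authors' exact model):
# pool variable-length self-supervised frame features into a fixed-size
# acoustic word embedding, trained with a supervised triplet objective.
import torch
import torch.nn as nn

class AWEEncoder(nn.Module):
    def __init__(self, feat_dim: int = 768, embed_dim: int = 128):
        super().__init__()
        # BiGRU over the frame sequence; the final hidden states of both
        # directions are concatenated and projected to the embedding space.
        self.rnn = nn.GRU(feat_dim, embed_dim, batch_first=True,
                          bidirectional=True)
        self.proj = nn.Linear(2 * embed_dim, embed_dim)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, num_frames, feat_dim), e.g. pre-extracted
        # wav2vec 2.0 features for a word segment.
        _, h = self.rnn(frames)                   # h: (2, batch, embed_dim)
        pooled = torch.cat([h[0], h[1]], dim=-1)  # (batch, 2 * embed_dim)
        return nn.functional.normalize(self.proj(pooled), dim=-1)

# Supervised objective: anchor and positive are different spoken instances
# of the same word; the negative is a segment of a different word.
# Word labels are what makes this framework supervised.
encoder = AWEEncoder()
loss_fn = nn.TripletMarginLoss(margin=0.4)  # margin chosen for illustration
anchor, positive, negative = (torch.randn(8, 50, 768) for _ in range(3))
loss = loss_fn(encoder(anchor), encoder(positive), encoder(negative))
loss.backward()
```

- Because the encoder consumes generic frame-level features rather than language-specific inputs, a model trained this way on English word pairs can, in principle, be applied unchanged to segments from another language, which is the transfer setting the abstract evaluates.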
- Subjects:
- Computation and Language (cs.CL)
Sound (cs.SD)
Audio and Speech Processing (eess.AS)
FOS: Computer and information sciences
FOS: Electrical engineering, electronic engineering, information engineering
Details
- Database: OpenAIRE
- Accession number: edsair.doi.dedup.....c68524994da2ca464f1c2b7dfd6a16da
- Full Text: https://doi.org/10.48550/arxiv.2301.01020