Do self-supervised speech models develop human-like perception biases?
- Source :
- Proceedings ACL 2022, ACL 2022-60th Annual Meeting of the Association for Computational Linguistics, May 2022, Dublin, Ireland. pp.7591-7605, ⟨10.18653/v1/2022.acl-long.523⟩
- Publication Year :
- 2022
- Publisher :
- HAL CCSD, 2022.
Abstract
- International audience; Self-supervised models for speech processing form representational spaces without using any external labels. Increasingly, they appear to be a feasible way of at least partially eliminating costly manual annotations, a problem of particular concern for low-resource languages. But what kind of representational spaces do these models construct? Human perception specializes to the sounds of listeners' native languages. Does the same thing happen in self-supervised models? We examine the representational spaces of three kinds of state-of-the-art self-supervised models: wav2vec 2.0, HuBERT and contrastive predictive coding (CPC), and compare them with the perceptual spaces of French-speaking and English-speaking human listeners, both globally and taking account of the behavioural differences between the two language groups. We show that the CPC model shows a small native language effect, but that wav2vec 2.0 and HuBERT seem to develop a universal speech perception space which is not language specific. A comparison against the predictions of supervised phone recognisers suggests that all three self-supervised models capture relatively fine-grained perceptual phenomena, while supervised models are better at capturing coarser, phone-level effects of listeners' native language on perception.
- Subjects :
- FOS: Computer and information sciences
Sound (cs.SD)
Computer Science - Computation and Language
Audio and Speech Processing (eess.AS)
FOS: Electrical engineering, electronic engineering, information engineering
[SCCO.LING]Cognitive science/Linguistics
Computation and Language (cs.CL)
Computer Science - Sound
Electrical Engineering and Systems Science - Audio and Speech Processing
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
Details
- Language :
- English
- Database :
- OpenAIRE
- Journal :
- Proceedings ACL 2022, ACL 2022-60th Annual Meeting of the Association for Computational Linguistics, May 2022, Dublin, Ireland. pp.7591-7605, ⟨10.18653/v1/2022.acl-long.523⟩
- Accession number :
- edsair.doi.dedup.....dbe36c6fb7a3706d766e8572ac371282