Speech Sequence Embeddings using Nearest Neighbors Contrastive Learning
- Source :
- Interspeech 2022 - 23rd INTERSPEECH Conference, Sep 2022, Incheon, South Korea
- Publication Year :
- 2022
- Publisher :
- HAL CCSD, 2022.
-
Abstract
- International audience; We introduce a simple neural encoder architecture that can be trained with an unsupervised contrastive learning objective whose positive samples are obtained from data-augmented k-Nearest Neighbors search. We show that when built on top of recent self-supervised audio representations [1, 2, 3], this method can be applied iteratively and yields competitive speech sequence embeddings (SSEs), as evaluated on two tasks: query-by-example on random sequences of speech, and spoken term discovery. On both tasks our method pushes the state of the art by a significant margin across 5 different languages. Finally, we establish a benchmark on a query-by-example task on the LibriSpeech dataset to monitor future improvements in the field.
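The record itself contains no code, but the abstract's central idea, mining positive pairs for a contrastive loss via nearest-neighbor search over embeddings, can be illustrated. The following NumPy sketch is a hedged, minimal stand-in (not the authors' implementation): `knn_positives` and `info_nce_loss` are hypothetical helper names, and the real system operates on learned self-supervised audio representations rather than random vectors.

```python
import numpy as np

def knn_positives(embeddings, k=1):
    """For each embedding, return indices of its k nearest neighbors
    (excluding itself) under cosine similarity -- a toy stand-in for
    the k-NN positive mining the abstract describes."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # never select a point as its own positive
    return np.argsort(sim, axis=1)[:, -k:]  # indices of the k most similar points

def info_nce_loss(anchors, positives, temperature=0.1):
    """Toy InfoNCE-style contrastive loss: each anchor should score highest
    against its mined positive, with the other positives in the batch
    acting as negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature
    # cross-entropy with the matching positive on the diagonal
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Usage sketch: mine a nearest-neighbor positive per anchor, then score the batch.
rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))            # 8 toy "sequence embeddings"
pos_idx = knn_positives(emb, k=1)[:, 0]   # nearest neighbor of each point
loss = info_nce_loss(emb, emb[pos_idx])
```

In the paper's setting, data augmentation would be applied before the neighbor search, and the whole mine-then-train loop is applied iteratively; this sketch only shows the neighbor selection and the contrastive objective.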
- Subjects :
- FOS: Computer and information sciences
Artificial Intelligence (cs.AI)
Data augmentation
[STAT.ML]Statistics [stat]/Machine Learning [stat.ML]
Computer Science - Artificial Intelligence
k-nearest neighbors
[SCCO.COMP]Cognitive science/Computer science
[SCCO.LING]Cognitive science/Linguistics
Unsupervised speech sequence embeddings
Details
- Language :
- English
- Database :
- OpenAIRE
- Journal :
- Interspeech 2022 - 23rd INTERSPEECH Conference, Sep 2022, Incheon, South Korea
- Accession number :
- edsair.doi.dedup.....3260337894b9f00327e592d347f92b52