Author: "Équipe Structuration, Analyse et MOdélisation de documents Vidéo et Audio (IRIT-SAMoVA)" / Journal: 2022 IEEE Spoken Language Technology Workshop (SLT) - Searchworks@Jio Institute Digital Library Search Results

1. Continual Self-Supervised Domain Adaptation for End-to-End Speaker Diarization

Author: Coria, Juan Manuel; Bredin, Hervé; Ghannay, Sahar; Rosset, Sophie
Affiliations: Laboratoire Interdisciplinaire des Sciences du Numérique (LISN); Institut National de Recherche en Informatique et en Automatique (Inria); CentraleSupélec; Université Paris-Saclay; Centre National de la Recherche Scientifique (CNRS); Équipe Structuration, Analyse et MOdélisation de documents Vidéo et Audio (IRIT-SAMoVA); Institut de recherche en informatique de Toulouse (IRIT); Université Toulouse Capitole (UT Capitole); Université de Toulouse (UT); Université Toulouse - Jean Jaurès (UT2J); Université Toulouse III - Paul Sabatier (UT3); Institut National Polytechnique (Toulouse) (Toulouse INP); Toulouse Mind & Brain Institut (TMBI)
Funding and support: Université Paris-Saclay under PhD contract number 2019-089, access to the HPC resources of IDRIS under the allocation AD011012177R1 made by GENCI, and IEEE Speech and Language Processing Technical Committee
Subjects: Self-supervised learning, Domain adaptation, [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing, End-to-end speaker diarization, Continual learning, [INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE], [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL], [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
Abstract: In conventional domain adaptation for speaker diarization, a large collection of annotated conversations from the target domain is required. In this work, we propose a novel continual training scheme for domain adaptation of an end-to-end speaker diarization system, which processes one conversation at a time and benefits from full self-supervision thanks to pseudo-labels. The qualities of our method allow for autonomous adaptation (e.g. of a voice assistant to a new household), while also avoiding permanent storage of possibly sensitive user conversations. We experiment extensively on the 11 domains of the DIHARD III corpus and show the effectiveness of our approach with respect to a pre-trained baseline, achieving a relative 17% performance improvement. We also find that data augmentation and a well-defined target domain are key factors to avoid divergence and to benefit from transfer.
Published: 2023
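Note: the abstract describes adapting a model one conversation at a time using its own pseudo-labels, with data augmentation, and without keeping conversations afterwards. The following is a minimal toy sketch of that general idea only, not the authors' implementation. Everything in it is a hypothetical stand-in: `ToyModel` (a single scalar threshold instead of a neural diarization system), `add_noise` (instead of real audio augmentation), and `adapt_continually`, along with all parameter values.

```python
import random
from typing import Callable, Iterable, List, Sequence


class ToyModel:
    """Hypothetical stand-in for a diarization model: one scalar threshold."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def predict(self, frames: Sequence[float]) -> List[int]:
        # Label each frame 0 or 1 depending on which side of the threshold it falls.
        return [1 if x > self.threshold else 0 for x in frames]

    def update(self, frames: Sequence[float], labels: Sequence[int], lr: float = 0.1) -> None:
        # Move the threshold toward the midpoint of the two pseudo-labeled class means.
        zeros = [x for x, y in zip(frames, labels) if y == 0]
        ones = [x for x, y in zip(frames, labels) if y == 1]
        if not zeros or not ones:
            return  # degenerate pseudo-labels: skip the update
        target = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
        self.threshold += lr * (target - self.threshold)


def add_noise(frames: Sequence[float], rng: random.Random, scale: float = 0.05) -> List[float]:
    """Toy augmentation (assumption): small Gaussian noise on every frame."""
    return [x + rng.gauss(0.0, scale) for x in frames]


def adapt_continually(
    model: ToyModel,
    conversations: Iterable[Sequence[float]],
    augment: Callable[[Sequence[float]], List[float]],
    steps_per_conversation: int = 3,
) -> ToyModel:
    """Adapt on one conversation at a time, using the model's own predictions as labels."""
    for conversation in conversations:
        pseudo_labels = model.predict(conversation)  # self-supervision: no human annotation
        for _ in range(steps_per_conversation):
            model.update(augment(conversation), pseudo_labels)
        # The conversation goes out of scope here; nothing is stored permanently.
    return model
```

In this toy setting, a model whose threshold starts at 2.5 and sees conversations with frames clustered near 2 and 4 drifts toward the midpoint of the two clusters. If the initial threshold puts every frame on one side, no update happens at all.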
