
Learning to detect dysarthria from raw speech

Authors :
Neil Zeghidour
Juliette Millet
Affiliations :
Laboratoire de Linguistique Formelle (LLF, UMR 7110), Centre National de la Recherche Scientifique (CNRS) - Université Paris Diderot - Paris 7 (UPD7)
Apprentissage machine et développement cognitif (CoML), Laboratoire de sciences cognitives et psycholinguistique (LSCP), Département d'Etudes Cognitives (DEC), École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL), École des hautes études en sciences sociales (EHESS), Centre National de la Recherche Scientifique (CNRS), Inria de Paris - Institut National de Recherche en Informatique et en Automatique (Inria)
Facebook AI Research [Paris] (FAIR)
Source :
ICASSP 2019 - IEEE International Conference on Acoustics, Speech and Signal Processing, May 2019, Brighton, United Kingdom
Publication Year :
2019
Publisher :
HAL CCSD, 2019.

Abstract

Speech classifiers of paralinguistic traits traditionally learn from diverse hand-crafted low-level features by selecting the information relevant to the task at hand. We explore an alternative to this selection by jointly learning the classifier and the feature extraction. Recent work on speech recognition has shown improved performance over speech features by learning from the waveform. We extend this approach to paralinguistic classification and propose a neural network that can learn a filterbank, a normalization factor, and a compression power from the raw speech, jointly with the rest of the architecture. We apply this model to dysarthria detection from sentence-level audio recordings. Starting from a strong attention-based baseline on which mel-filterbanks outperform standard low-level descriptors, we show that learning the filters or the normalization and compression improves over fixed features by 10% absolute accuracy. We also observe a gain over OpenSmile features when jointly learning the feature extraction, the normalization, and the compression factor with the architecture. This constitutes a first attempt at learning all these operations jointly from raw audio for a speech classification task.
Comment : 5 pages, 3 figures, submitted to ICASSP
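
The abstract describes a front-end in which the filterbank, the normalization, and the compression power are all trainable parameters optimized together with the downstream classifier. Below is a minimal, hedged sketch of that idea in PyTorch; the layer choices, filter count, window size, and initialization are illustrative assumptions and not the authors' exact configuration.

```python
# Sketch of a learnable front-end: raw waveform -> learnable filterbank
# -> learnable compression power -> learnable per-channel normalization.
# Hyperparameters below are assumptions for illustration only.
import torch
import torch.nn as nn


class LearnableFrontEnd(nn.Module):
    def __init__(self, n_filters=40, kernel_size=400, stride=160):
        super().__init__()
        # Learnable filterbank: 1-D convolution over the raw waveform
        # (could be initialized from mel filters; random init shown here).
        self.filters = nn.Conv1d(1, n_filters, kernel_size, stride=stride, bias=False)
        # Learnable compression exponent, initialized near log-like compression.
        self.power = nn.Parameter(torch.tensor(0.3))
        # Learnable per-channel normalization (scale and offset).
        self.norm = nn.InstanceNorm1d(n_filters, affine=True)

    def forward(self, wav):
        # wav: (batch, samples) raw speech
        x = self.filters(wav.unsqueeze(1))     # (batch, n_filters, frames)
        x = x.pow(2) + 1e-6                    # per-band energy
        x = x.pow(self.power.clamp(min=0.05))  # learnable compression
        return self.norm(x)                    # learnable normalization


# Usage: the front-end is trained end to end with any downstream classifier
# (e.g. the attention-based model mentioned in the abstract).
frontend = LearnableFrontEnd()
features = frontend(torch.randn(8, 16000))     # 1 second of 16 kHz audio
print(features.shape)                          # torch.Size([8, 40, 98])
```

Because the compression exponent and normalization are ordinary parameters, gradients from the classification loss adapt them to the task, which is the mechanism the abstract credits for the gain over fixed mel-filterbank and OpenSmile features.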

Details

Language :
English
Database :
OpenAIRE
Journal :
ICASSP 2019 - IEEE International Conference on Acoustics, Speech and Signal Processing, May 2019, Brighton, United Kingdom
Accession number :
edsair.doi.dedup.....152814676bf1c3d8df1680ab7b6d010e