Large-Scale Weakly Labeled Semi-Supervised Sound Event Detection in Domestic Environments

Authors :
Serizel, Romain
Turpault, Nicolas
Eghbal-Zadeh, Hamid
Shah, Ankit Parag
Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH)
Inria Nancy - Grand Est
Institut National de Recherche en Informatique et en Automatique (Inria)-Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD)
Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA)
Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA)
Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)
Johannes Kepler University Linz [Linz] (JKU)
Language Technologies Institute [Pittsburgh] (LTI)
Carnegie Mellon University [Pittsburgh] (CMU)
Grid'5000
Source :
Workshop on Detection and Classification of Acoustic Scenes and Events, Nov 2018, Woking, United Kingdom
Publication Year :
2018
Publisher :
HAL CCSD, 2018.

Abstract

Submitted to DCASE2018 Workshop; International audience; This paper presents DCASE 2018 task 4. The task evaluates systems for the large-scale detection of sound events using weakly labeled data (without time boundaries). The systems must provide not only the event class but also the event time boundaries, given that multiple events can be present in an audio recording. A further challenge of the task is to explore whether a large amount of unbalanced and unlabeled training data can be exploited together with a small weakly labeled training set to improve system performance. The data are YouTube video excerpts from domestic contexts, a domain with many applications such as ambient assisted living. The domain was chosen for its scientific challenges (wide variety of sounds, time-localized events...) and potential industrial applications.

Details

Language :
English
Database :
OpenAIRE
Journal :
Workshop on Detection and Classification of Acoustic Scenes and Events, Nov 2018, Woking, United Kingdom
Accession number :
edsair.doi.dedup.....5389fb32e83e14ce7865d221ab7b987c