
Foreground-Background Ambient Sound Scene Separation

Authors:
Michel Olvera
Romain Serizel
Emmanuel Vincent
Gilles Gasso
Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH)
Inria Nancy - Grand Est
Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD)
Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria) - Université de Lorraine (UL) - Centre National de la Recherche Scientifique (CNRS)
Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes (LITIS), Normandie Université (NU) - Université de Rouen Normandie (UNIROUEN) - Université Le Havre Normandie (ULH) - Institut national des sciences appliquées Rouen Normandie (INSA Rouen Normandie)
This work was made with the support of the French National Research Agency, in the framework of the project LEAUDS "Learning to understand audio scenes" (ANR-18-CE23-0020). Experiments presented in this paper were carried out using the Grid'5000 testbed, supported by a scientific interest group hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations (see https://www.grid5000.fr).
GRID5000
ANR-18-CE23-0020, LEAUDS, Learning to Understand Audio Scenes (2018)
Source:
EUSIPCO 2020 - 28th European Signal Processing Conference, Jan 2021, Amsterdam / Virtual, Netherlands. ⟨10.23919/Eusipco47968.2020.9287436⟩
Publication Year:
2020

Abstract

Ambient sound scenes typically comprise multiple short events occurring on top of a somewhat stationary background. We consider the task of separating these events from the background, which we call foreground-background ambient sound scene separation. We propose a deep learning-based separation framework with a suitable feature normalization scheme and an optional auxiliary network capturing the background statistics, and we investigate its ability to handle the great variety of sound classes encountered in ambient sound scenes, many of which are not seen in training. To do so, we create single-channel foreground-background mixtures using isolated sounds from the DESED and AudioSet datasets, and we conduct extensive experiments with mixtures of seen or unseen sound classes at various signal-to-noise ratios. Our experimental findings demonstrate the generalization ability of the proposed approach.
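The abstract describes creating single-channel mixtures of an isolated foreground event and a background at controlled signal-to-noise ratios. A minimal sketch of such a mixing step is shown below; the `mix_at_snr` helper and the toy signals are illustrative assumptions, not the paper's exact data-generation pipeline.

```python
import numpy as np

def mix_at_snr(foreground, background, snr_db, eps=1e-12):
    """Scale the foreground so its power relative to the background
    matches the target SNR (in dB), then sum the two signals.

    Hypothetical helper; the paper's actual procedure may differ."""
    fg_power = np.mean(foreground ** 2)
    bg_power = np.mean(background ** 2)
    # Gain bringing the foreground/background power ratio to snr_db.
    gain = np.sqrt(bg_power * 10 ** (snr_db / 10) / (fg_power + eps))
    return gain * foreground + background

# Toy example: a 1-second tone "event" over a white-noise "background".
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr
event = np.sin(2 * np.pi * 440 * t)           # short foreground event
background = 0.1 * rng.standard_normal(sr)    # stationary background
mixture = mix_at_snr(event, background, snr_db=0.0)
```

At 0 dB the scaled event and the background contribute equal power to the mixture; sweeping `snr_db` yields the range of mixing conditions the experiments evaluate.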

Details

Language:
English
Database:
OpenAIRE
Journal:
EUSIPCO 2020 - 28th European Signal Processing Conference, Jan 2021, Amsterdam / Virtual, Netherlands. ⟨10.23919/Eusipco47968.2020.9287436⟩
Accession number:
edsair.doi.dedup.....983935e6623da811ba9e937e331b90cf