Descriptor: "synthetic soundscapes" / Publisher: hal ccsd - Searchworks@Jio Institute Digital Library Search Results

Your search for "synthetic soundscapes" returned 4 results.

1. The impact of non-target events in synthetic soundscapes for sound event detection

Author: Ronchini, Francesca; Serizel, Romain; Turpault, Nicolas; Cornell, Samuele
Affiliations: Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est; Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS); Department of Electrical Engineering-SCD [Leuven] (ESAT-SCD), Catholic University of Leuven - Katholieke Universiteit Leuven (KU Leuven); Università Politecnica delle Marche [Ancona] (UNIVPM)
Subjects: FOS: Computer and information sciences; Computer Science - Machine Learning; Sound (cs.SD); open-source datasets; deep learning; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]; Computer Science - Sound; [INFO.INFO-SD] Computer Science [cs]/Sound [cs.SD]; Machine Learning (cs.LG); Sound event detection; Audio and Speech Processing (eess.AS); FOS: Electrical engineering, electronic engineering, information engineering; Electrical Engineering and Systems Science - Audio and Speech Processing; synthetic soundscapes
Abstract: The Detection and Classification of Acoustic Scenes and Events (DCASE) Challenge 2021 Task 4 uses a heterogeneous dataset that includes both recorded and synthetic soundscapes. Until recently, only target sound events were considered when synthesizing the soundscapes. However, recorded soundscapes often contain a substantial number of non-target events that may affect performance. In this paper, we focus on the impact of these non-target events in the synthetic soundscapes. First, we investigate to what extent using non-target events during only the training phase, only the validation phase, or neither helps the system correctly detect target events. Second, we analyze to what extent adjusting the signal-to-noise ratio between target and non-target events at training improves sound event detection performance. The results show that using both target and non-target events in only one of the phases (validation or training) helps the system detect sound events properly, outperforming the baseline (which uses non-target events in both phases). The paper also reports the results of a preliminary study evaluating the system on clips that contain only non-target events. This opens questions for future work on the non-target subset and on the acoustic similarity between target and non-target events, which might confuse the system.
Published: 2021

2. Improving Sound Event Detection In Domestic Environments Using Sound Separation

Author: Turpault, Nicolas; Wisdom, Scott; Erdogan, Hakan; Hershey, John; Serizel, Romain; Fonseca, Eduardo; Seetharaman, Prem; Salamon, Justin
Affiliations: Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est; Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS); Google Inc.; Music Technology Group (MTG), Universitat Pompeu Fabra [Barcelona] (UPF); Northwestern University [Evanston]; Adobe Research
Acknowledgments: We would like to thank the other organizers of DCASE 2020 task 4: Daniel P. W. Ellis and Ankit Parag Shah.
Funding and resources: Grid'5000; ANR-18-CE23-0020, LEAUDS, Learning to Understand Audio Scenes (Apprentissage statistique pour la compréhension de scènes audio), 2018
Subjects: Signal Processing (eess.SP); FOS: Computer and information sciences; Sound (cs.SD); Computer Science - Sound; [INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; Synthetic soundscapes; Sound event detection; Audio and Speech Processing (eess.AS); Theory of Computation: Logics and Meanings of Programs; [INFO.INFO-SD] Computer Science [cs]/Sound [cs.SD]; FOS: Electrical engineering, electronic engineering, information engineering; Electrical Engineering and Systems Science - Signal Processing; Sound separation; [SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing; Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Performing sound event detection on real-world recordings often implies dealing with overlapping target sound events and non-target sounds, also referred to as interference or noise. Until now, these problems have mainly been tackled at the classifier level. We propose to use sound separation as a pre-processing step for sound event detection. In this paper, we start from a sound separation model trained on the Free Universal Sound Separation dataset and from the DCASE 2020 task 4 sound event detection baseline. We explore different methods to combine the separated sound sources and the original mixture within the sound event detection system. Furthermore, we investigate the impact of adapting the sound separation model to the sound event detection data on both sound separation and sound event detection performance.
Published: 2020

3. Training Sound Event Detection On A Heterogeneous Dataset

Author: Turpault, Nicolas; Serizel, Romain
Affiliations: Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est; Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)
Funding: ANR-18-CE23-0020, LEAUDS, Learning to Understand Audio Scenes (Apprentissage statistique pour la compréhension de scènes audio), 2018
Subjects: Signal Processing (eess.SP); FOS: Computer and information sciences; semi-supervised learning; Sound (cs.SD); Computer Science - Sound; [INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; ablation study; Sound event detection; Audio and Speech Processing (eess.AS); weakly labeled data; [INFO.INFO-SD] Computer Science [cs]/Sound [cs.SD]; FOS: Electrical engineering, electronic engineering, information engineering; Electrical Engineering and Systems Science - Signal Processing; [SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing; Electrical Engineering and Systems Science - Audio and Speech Processing; synthetic soundscapes
Abstract: Training a sound event detection algorithm on a heterogeneous dataset that includes both recorded and synthetic soundscapes with varying labeling granularity is a non-trivial task that can lead to systems requiring several technical choices. These technical choices are often passed from one system to another without being questioned. We propose a detailed analysis of the DCASE 2020 task 4 sound event detection baseline with regard to several aspects, such as the type of data used for training, the parameters of the mean-teacher model, and the transformations applied while generating the synthetic soundscapes. Some of the parameters that are usually used as defaults are shown to be sub-optimal.
Published: 2020

4. Sound Event Detection and Separation: a Benchmark on Desed Synthetic Soundscapes

Author: Serizel, Romain; Erdogan, Hakan; Salamon, Justin; Turpault, Nicolas; Hershey, John R.; Wisdom, Scott; Fonseca, Eduardo; Seetharaman, Prem
Affiliations: Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est; Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS); Google Inc, Research at Google; Universitat Pompeu Fabra [Barcelona] (UPF); Descript, Inc.; Adobe Research
Acknowledgments: Part of this work was made with the support of the French National Research Agency, in the framework of the project LEAUDS 'Learning to understand audio scenes' (ANR-18-CE23-0020), and the French region Grand-Est. High Performance Computing resources were partially provided by the EXPLOR centre hosted by the Université de Lorraine.
Funding and resources: Grid'5000; ANR-18-CE23-0020, LEAUDS, Learning to Understand Audio Scenes (Apprentissage statistique pour la compréhension de scènes audio), 2018
Subjects: FOS: Computer and information sciences; Sound localization; Sound (cs.SD); Reverberation; Soundscape; Computer science; Speech recognition; Engineering and technology; Computer Science - Sound; [INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing; Audio and Speech Processing (eess.AS); Robustness (computer science); FOS: Electrical engineering, electronic engineering, information engineering; Sound (geography); synthetic soundscapes; geography; Signal processing; Event (computing); Sound event detection; [INFO.INFO-SD] Computer Science [cs]/Sound [cs.SD]; Benchmark (computing); sound separation; Artificial intelligence & image processing; Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: We propose a benchmark of state-of-the-art sound event detection (SED) systems. We designed synthetic evaluation sets to focus on specific sound event detection challenges. We analyze the performance of the submissions to DCASE 2020 task 4 depending on time-related modifications (time position of an event and length of clips), and we study the impact of non-target sound events and reverberation. We show that localizing sound events in time is still a problem for SED systems. We also show that reverberation and non-target sound events severely degrade the performance of SED systems. In the latter case, sound separation seems like a promising solution.
Published: 2020
