Explaining Deep Learning Models for Speech Enhancement
- Source :
- INTERSPEECH 2021, Aug 2021, Brno, Czech Republic. ⟨10.21437/Interspeech.2021-1764⟩
- Publication Year :
- 2021
- Publisher :
- ISCA, 2021.
Abstract
- We consider the problem of explaining the robustness of neural networks used to compute time-frequency masks for speech enhancement to mismatched noise conditions. We employ the Deep SHapley Additive exPlanations (DeepSHAP) feature attribution method to quantify the contribution of every time-frequency bin in the input noisy speech signal to every time-frequency bin in the output time-frequency mask. We define an objective metric, referred to as the speech relevance score, that summarizes the obtained SHAP values, and show that it correlates with the enhancement performance, as measured by the word error rate on the CHiME-4 real evaluation dataset. We use the speech relevance score to explain the generalization ability of three speech enhancement models trained using synthetically generated speech-shaped noise, noise from a professional sound effects library, or real CHiME-4 noise. To the best of our knowledge, this is the first study on neural network explainability in the context of speech enhancement.
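The abstract describes summarizing per-bin SHAP attributions into a single speech relevance score. The paper's exact definition is not reproduced in this record, so the following is only a hypothetical sketch: it treats the score as the fraction of absolute attribution mass that falls on speech-dominant time-frequency bins. The function name, the mask convention, and the normalization are all assumptions for illustration.

```python
import numpy as np

def speech_relevance_score(shap_values, speech_mask):
    """Hypothetical summary of per-bin SHAP attributions.

    shap_values : (freq, time) array of SHAP values attributed to the
        input noisy-speech time-frequency bins.
    speech_mask : boolean (freq, time) array marking speech-dominant bins.

    Returns the fraction of absolute attribution mass assigned to
    speech-dominant bins (the paper's actual metric may be defined
    differently).
    """
    total = np.abs(shap_values).sum()
    if total == 0.0:
        return 0.0
    return float(np.abs(shap_values[speech_mask]).sum() / total)

# Toy example: a 2x2 time-frequency grid with one speech-dominant bin.
shap_vals = np.array([[0.6, -0.2],
                      [0.1,  0.1]])
mask = np.array([[True, False],
                 [False, False]])
score = speech_relevance_score(shap_vals, mask)  # 0.6 / 1.0 = 0.6
```

Under this reading, a model whose mask predictions rely mostly on speech-dominant input bins would score close to 1, which is consistent with the abstract's claim that the score correlates with enhancement performance.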
- Subjects :
- Artificial neural network
Computer science
Deep learning
Speech recognition
Word error rate
explainable AI
Speech enhancement
Noise
feature attribution
Robustness (computer science)
Feature (machine learning)
Artificial intelligence
[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG]
[INFO.INFO-SD]Computer Science [cs]/Sound [cs.SD]
Details
- Database :
- OpenAIRE
- Journal :
- Interspeech 2021
- Accession number :
- edsair.doi.dedup.....9f0e8c6b342f8c39c39ebff8ed0ca17e
- Full Text :
- https://doi.org/10.21437/interspeech.2021-1764