Efficient model-based speech separation and denoising using non-negative subspace analysis

Authors :: Peder A. Olsen
John R. Hershey
Steven J. Rennie
Source :: ICASSP
Publication Year :: 2008
Publisher :: IEEE, 2008.
Abstract :: We present a new probabilistic architecture for analyzing composite non-negative data, called Non-negative Subspace Analysis (NSA). The NSA model provides a framework for understanding the relationships between sparse subspace and mixture-model-based approaches, and encompasses a range of models, including Sparse Non-negative Matrix Factorization (SNMF) [1] and mixture-model-based analysis as special cases. We present a convenient instantiation of the NSA model, and an efficient variational approximate learning and inference algorithm that combines the advantages of SNMF and mixture-model-based approaches. Preliminary recognition results on the Pascal Speech Separation Challenge 2006 test set [2], based on NSA separation results, are presented. The results fall short of those achieved by Algonquin [3], a state-of-the-art mixture-model-based method, but considering that NSA runs an order of magnitude faster, the results are impressive. NSA outperforms SNMF in terms of word error rate (WER) on the task by a significant margin of over 9% absolute.

Subjects :: Test set
Source separation
Probabilistic logic
Word error rate
Pattern recognition
Artificial intelligence
Speech processing
Mixture model
Subspace topology
Mathematics
Matrix decomposition

ISSN :: 1520-6149
Database :: OpenAIRE
Journal :: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing
Accession number :: edsair.doi...........1350fcd342327396ad3d77104efbefff
Full Text :: https://doi.org/10.1109/icassp.2008.4517989
