Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective
- Source :
- ICASSP
- Publication Year :
- 2019
- Publisher :
- IEEE, 2019.
Abstract
- This study investigates phase reconstruction for deep learning based monaural talker-independent speaker separation in the short-time Fourier transform (STFT) domain. The key observation is that, for a mixture of two sources, with their magnitudes accurately estimated and under a geometric constraint, the absolute phase difference between each source and the mixture can be uniquely determined; in addition, the source phases at each time-frequency (T-F) unit can be narrowed down to only two candidates. To pick the right candidate, we propose three algorithms based on iterative phase reconstruction, group delay estimation, and phase-difference sign prediction. State-of-the-art results are obtained on the publicly available wsj0-2mix and wsj0-3mix corpora.
Comment: 5 pages, in submission to ICASSP-2019
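The trigonometric observation in the abstract can be illustrated for a single T-F unit. The sketch below is not the authors' implementation; it assumes the mixture coefficient equals the sum of the two source coefficients, that both source magnitudes are known exactly and are nonzero, and that the three magnitudes satisfy the triangle inequality (taken here as the geometric constraint). The function name `phase_candidates` and its parameters are hypothetical. Under those assumptions the law of cosines fixes the absolute phase difference between each source and the mixture, and only the sign remains ambiguous, which leaves two candidate solutions.

```python
import cmath
import math


def phase_candidates(mix, mag1, mag2):
    """Return the two (s1, s2) candidates with |s1| = mag1, |s2| = mag2, s1 + s2 = mix.

    Hypothetical helper for one T-F unit; assumes nonzero magnitudes.
    """
    mag_y = abs(mix)
    phase_y = cmath.phase(mix)

    # Law of cosines on the triangle with sides |mix|, mag1, mag2.
    cos1 = (mag_y**2 + mag1**2 - mag2**2) / (2 * mag_y * mag1)
    cos2 = (mag_y**2 + mag2**2 - mag1**2) / (2 * mag_y * mag2)

    # Clip to [-1, 1] to absorb rounding error. If the triangle inequality
    # is actually violated, the candidates will no longer sum to the mixture.
    theta1 = math.acos(max(-1.0, min(1.0, cos1)))
    theta2 = math.acos(max(-1.0, min(1.0, cos2)))

    candidates = []
    for sign in (1.0, -1.0):
        # The two sources lie on opposite sides of the mixture vector,
        # so their phase differences carry opposite signs.
        s1 = cmath.rect(mag1, phase_y + sign * theta1)
        s2 = cmath.rect(mag2, phase_y - sign * theta2)
        candidates.append((s1, s2))
    return candidates


# Usage: build a mixture from two known sources, then recover the candidates.
true_s1 = cmath.rect(1.0, 0.3)
true_s2 = cmath.rect(0.7, -1.1)
for s1, s2 in phase_candidates(true_s1 + true_s2, abs(true_s1), abs(true_s2)):
    print(s1, s2)
```

Both candidates reproduce the mixture exactly, so the mixture and magnitudes alone cannot choose between them. That remaining sign ambiguity is what the three algorithms named in the abstract are meant to resolve.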
- Subjects :
- FOS: Computer and information sciences
Sound (cs.SD)
Computer Science - Computation and Language
Absolute phase
Perspective (graphical)
Short-time Fourier transform
Computer Science - Sound
Domain (mathematical analysis)
Fourier transform
Audio and Speech Processing (eess.AS)
FOS: Electrical engineering, electronic engineering, information engineering
Trigonometry
Computation and Language (cs.CL)
Algorithm
Electrical Engineering and Systems Science - Audio and Speech Processing
Sign (mathematics)
Group delay and phase delay
Mathematics
Details
- Database :
- OpenAIRE
- Journal :
- ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- Accession number :
- edsair.doi.dedup.....bcb6b6793fb5fd3329fdb2def44c319f