A review on speaker diarization systems and approaches

Authors :
Mohammad Hossein Moattar
Mohammad Mehdi Homayounpour
Source :
Speech Communication. 54:1065-1103
Publication Year :
2012
Publisher :
Elsevier BV, 2012.

Abstract

Speaker indexing, or diarization, is an important task in audio processing and retrieval. Speaker diarization is the process of labeling a speech signal with labels corresponding to the identity of the speakers. This paper presents a comprehensive review of the evolution of the technology and of the different approaches to speaker indexing, with a detailed discussion of these approaches and their contributions. It reviews the most common features for speaker diarization, as well as the most important approaches to speech activity detection (SAD) in diarization frameworks. The two main tasks of speaker indexing are speaker segmentation and speaker clustering, and the approaches proposed for each subtask are reviewed separately. Speaker diarization systems that combine the two tasks in a unified framework are also introduced. Another discussion concerns approaches to online speaker indexing, which differs fundamentally from traditional offline approaches. The paper also introduces the most common performance measures and evaluation datasets. To conclude, a complete framework for speaker indexing is proposed, which aims to be domain independent, parameter free, and applicable to both online and offline applications.

Details

ISSN :
0167-6393
Volume :
54
Database :
OpenAIRE
Journal :
Speech Communication
Accession number :
edsair.doi...........49f335eedfc65c91f979ac84e5dbe01c
Full Text :
https://doi.org/10.1016/j.specom.2012.05.002