A review on speaker diarization systems and approaches
- Source :
- Speech Communication. 54:1065-1103
- Publication Year :
- 2012
- Publisher :
- Elsevier BV, 2012.
Abstract
- Speaker indexing, or diarization, is an important task in audio processing and retrieval. Speaker diarization is the process of labeling a speech signal with labels corresponding to the identity of the speakers. This paper presents a comprehensive review of the evolution of the technology and of the different approaches to speaker indexing, with a detailed discussion of these approaches and their contributions. It reviews the most common features used for speaker diarization, as well as the most important approaches to speech activity detection (SAD) in diarization frameworks. The two main tasks of speaker indexing are speaker segmentation and speaker clustering, and the approaches proposed for each subtask are reviewed separately. Speaker diarization systems that combine the two tasks in a unified framework are also introduced. A further discussion concerns approaches to online speaker indexing, which differ fundamentally from traditional offline approaches. The paper also introduces the most common performance measures and evaluation datasets. To conclude, a complete framework for speaker indexing is proposed, which is intended to be domain independent, parameter free, and applicable to both online and offline applications.
- Subjects :
- Online and offline
Linguistics and Language
Voice activity detection
Process (engineering)
Computer science
Communication
Speech recognition
Search engine indexing
Speaker recognition
Language and Linguistics
Computer Science Applications
Speaker diarisation
Modeling and Simulation
Computer Vision and Pattern Recognition
Cluster analysis
Audio signal processing
Software
Details
- ISSN :
- 0167-6393
- Volume :
- 54
- Database :
- OpenAIRE
- Journal :
- Speech Communication
- Accession number :
- edsair.doi...........49f335eedfc65c91f979ac84e5dbe01c
- Full Text :
- https://doi.org/10.1016/j.specom.2012.05.002