Harmonic Structure Features for Robust Speaker Diarization.

Authors :
Yu Zhou
Hongbin Suo
Junfeng Li
Yonghong Yan
Source :
ETRI Journal; Aug 2012, Vol. 34 Issue 4, p583-590, 8p
Publication Year :
2012

Abstract

In this paper, we present a new approach to speaker diarization. First, we use prosodic information calculated on the original speech to resynthesize new speech data using a spectrum modeling technique. The resynthesized data is modeled with sinusoids based on pitch, vibration amplitude, and phase bias. Then, we extract cepstral features from the resynthesized speech and integrate them with the cepstral features from the original speech for speaker diarization. Finally, we show how the two streams of cepstral features can be combined to improve the robustness of speaker diarization. Experiments carried out on standardized datasets (the US National Institute of Standards and Technology Rich Transcription 04-S multiple distant microphone conditions) show a significant improvement in diarization error rate compared to a system based only on the feature stream from the original speech. [ABSTRACT FROM AUTHOR]
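
The two ideas in the abstract, sinusoidal resynthesis from pitch, amplitude, and phase, and the combination of two feature streams, can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the restriction to integer multiples of the pitch, and the weighted sum of per-stream log-likelihoods are all assumptions chosen for illustration, and the authors' actual model and fusion rule may differ.

```python
import math


def resynthesize_frame(f0_hz, amplitudes, phases, sample_rate, n_samples):
    """Build one frame as a sum of sinusoids at integer multiples of the pitch.

    Hypothetical helper: harmonic k has frequency k * f0_hz, amplitude
    amplitudes[k - 1], and phase offset phases[k - 1] (radians).
    """
    nyquist = sample_rate / 2.0
    frame = []
    for n in range(n_samples):
        t = n / sample_rate
        sample = 0.0
        for k, (amp, phase) in enumerate(zip(amplitudes, phases), start=1):
            if k * f0_hz >= nyquist:
                break  # higher harmonics would alias, so stop here
            sample += amp * math.cos(2.0 * math.pi * k * f0_hz * t + phase)
        frame.append(sample)
    return frame


def combine_stream_scores(loglik_original, loglik_resynth, weight_original=0.5):
    """Fuse two per-stream log-likelihoods with a fixed weight.

    Assumption: a weighted sum is one common way to merge feature streams;
    the abstract does not state which combination rule the authors use.
    """
    return (weight_original * loglik_original
            + (1.0 - weight_original) * loglik_resynth)


# Usage: a 10 ms frame at 8 kHz with a 100 Hz pitch and two harmonics.
frame = resynthesize_frame(100.0, [1.0, 0.5], [0.0, 0.0], 8000, 80)
fused = combine_stream_scores(-10.0, -20.0, weight_original=0.75)
```

In a full system, cepstral features would be extracted from frames like `frame` and scored against speaker models separately from the original-speech features. The weight then controls how much each stream contributes to the clustering decision.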

Details

Language :
English
ISSN :
1225-6463
Volume :
34
Issue :
4
Database :
Supplemental Index
Journal :
ETRI Journal
Publication Type :
Academic Journal
Accession number :
78310963
Full Text :
https://doi.org/10.4218/etrij.12.0111.0455