1. Audio Retrieval By Voice Imitation
- Author
Mohamad Khateeb, Hadas Benisty, and Samah Khawaled
- Subjects
Majority rule, Audio signal, Computer science, Speech recognition, Mixture model, 01 natural sciences, Support vector machine, 03 medical and health sciences, 0302 clinical medicine, Ranking, Keyword spotting, Histogram, 0103 physical sciences, 010301 acoustics, Classifier (UML), 030217 neurology & neurosurgery
- Abstract
Existing sound retrieval systems mostly rely on textual queries. Using text to describe a sound signal is not intuitive and is often inaccurate because of the user's subjective impression; different people may use different words to describe the same sound, which makes these systems complex to design and unintuitive to use. Vocal imitation, however, is the most natural human way to describe a sound. In this paper, we consider an emerging approach to sound retrieval based on vocal imitations, where the user records an imitation of the desired sound and the system retrieves a ranked list of the most similar sounds in the dataset. In this work, we represent sound signals using histograms obtained with respect to a Gaussian Mixture Model (GMM) that represents the spectral domain. This recently proposed approach was successfully applied to word representation in a keyword spotting task. Having a fixed-length representation for vocal imitation signals allows us to train a robust classifier using a support vector machine (SVM). Given a test imitation signal, we apply the classifier and use the output score to rank the retrieved signals, based on a majority vote. Our simulation results show that the proposed system yields a more accurate ranking than other existing solutions.
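The fixed-length representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each spectral frame is hard-assigned to its most likely GMM component, that the GMM has diagonal covariances, and that ranking amounts to counting votes per candidate label. The function names (`gmm_histogram`, `rank_by_votes`) and the toy parameters are hypothetical, and the SVM training step is omitted.

```python
import math
from collections import Counter


def log_gauss_diag(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian evaluated at frame x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_histogram(frames, weights, means, variances):
    """Map a variable-length frame sequence to a fixed-length histogram.

    Assumption: each frame votes for its single most likely component
    (hard assignment); the abstract does not specify this detail.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    counts = [0] * len(weights)
    for x in frames:
        scores = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        counts[scores.index(max(scores))] += 1
    # Normalize so recordings of different lengths are comparable.
    return [c / len(frames) for c in counts]


def rank_by_votes(votes):
    """Rank candidate labels by vote count, most voted first (assumed scheme)."""
    return [label for label, _ in Counter(votes).most_common()]


# Toy 2-component GMM over 2-D "spectral" frames (made-up parameters).
weights = [0.5, 0.5]
means = [(0.0, 0.0), (5.0, 5.0)]
variances = [(1.0, 1.0), (1.0, 1.0)]

short_clip = [(0.1, -0.2), (4.8, 5.1), (5.2, 4.9), (0.0, 0.3)]
long_clip = short_clip + [(5.1, 5.0), (4.9, 5.2)]

hist_short = gmm_histogram(short_clip, weights, means, variances)
hist_long = gmm_histogram(long_clip, weights, means, variances)
ranking = rank_by_votes(["dog", "cat", "dog", "bird", "dog", "cat"])
```

Both clips yield a histogram with one bin per GMM component, regardless of how many frames they contain, which is what makes a standard classifier such as an SVM applicable to recordings of different durations.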
- Published
- 2018