Speaker identification based on Radon transform and CNNs in the presence of different types of interference for Robotic Applications

Authors :: Fathi E. Abd El-Samie
Amira Shafik
Abdullah M. Iliyasu
Ashraf A. M. Khalaf
El-Sayed M. El-Rabaie
Oh-Young Song
Basma Abd El-Rahiem
Ghada M. El Banby
Ahmed Sedik
Source :: Applied Acoustics. 177:107665
Publication Year :: 2021
Publisher :: Elsevier BV, 2021.
Abstract: Both automatic speaker identification (ASI) and speech recognition can now be utilized for the control of modern robots. An ASI algorithm can be implemented at the speech interface of the robot to determine the identity of the person allowed to operate the robot, while speech recognition can be used to interpret the order given to the robot. Keeping an ASI system robust is challenging in the presence of speech degradations such as noise and interference. This study presents a new approach that uses a convolutional neural network (CNN) to improve the accuracy of speaker identification in the presence of interference for robot control applications. First, the speech signal from the speaker is divided into segments, each of which is transformed into a spectrogram, and the Radon transform of this spectrogram is then computed. The spectrogram resolves the speech segment into a map of power distribution over both time and frequency. Together, the spectrograms and their Radon transforms are used as inputs to a proposed CNN-based deep learning model. Necessary refinements are undertaken, and the resulting optimized “Radon-Deep-Learning Model” (RDLM) is compared with a benchmark model. The proposed model consists of six convolutional (CNV) layers followed by six max-pooling layers, while the benchmark model consists of three CNV layers followed by three max-pooling layers. Experimental results reveal that the proposed RDLM achieves a classification accuracy of up to 97.5%, which is more than double the performance reported for some traditional speaker identification methods.
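
The feature-extraction steps described in the abstract (segment the signal, compute a spectrogram per segment, then take the Radon transform of the spectrogram) can be sketched roughly as follows. This is a minimal NumPy illustration and not the authors' implementation: the function names, the segment length, the frame length, the hop size, and the projection angles are all assumptions, since the abstract specifies none of them. The Radon transform here is a simple rotate-and-sum approximation with nearest-neighbour sampling, and the CNN stage is left out entirely.

```python
import numpy as np


def split_segments(signal, segment_len):
    """Cut a 1-D signal into non-overlapping segments (segment_len is an assumed value)."""
    n = len(signal) // segment_len
    return np.reshape(signal[:n * segment_len], (n, segment_len))


def spectrogram(segment, frame_len=256, hop=128):
    """Power spectrogram from a Hann-windowed short-time FFT.

    frame_len and hop are illustrative defaults, not values from the paper.
    Rows of the result are frequency bins, columns are time frames.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(segment) - frame_len) // hop
    frames = np.stack([segment[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return (np.abs(np.fft.rfft(frames, axis=1)) ** 2).T


def radon(image, angles_deg):
    """Approximate Radon transform: rotate the image about its centre and sum each column.

    Nearest-neighbour sampling is used and pixels rotated outside the frame
    are dropped, so this is only a rough stand-in for a full implementation.
    Returns an array of shape (image width, number of angles).
    """
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    sinogram = np.zeros((w, len(angles_deg)))
    for k, angle in enumerate(np.deg2rad(angles_deg)):
        c, s = np.cos(angle), np.sin(angle)
        # For every output pixel, find the source pixel it is rotated from.
        src_x = np.rint(c * (xs - cx) + s * (ys - cy) + cx).astype(int)
        src_y = np.rint(-s * (xs - cx) + c * (ys - cy) + cy).astype(int)
        valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
        rotated = np.zeros_like(image, dtype=float)
        rotated[valid] = image[src_y[valid], src_x[valid]]
        sinogram[:, k] = rotated.sum(axis=0)
    return sinogram
```

Under these assumptions, each row of `split_segments(speech, 1024)` would be passed to `spectrogram`, and the result to `radon(spec, np.arange(0, 180, 30))`; the spectrogram and its Radon transform would then form the pair of inputs to a CNN classifier.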

Subjects :: 010302 applied physics
Acoustics and Ultrasonics
Radon transform
Computer science
business.industry
Speech recognition
Deep learning
01 natural sciences
Convolutional neural network
Robot control
Robustness (computer science)
0103 physical sciences
Benchmark (computing)
Spectrogram
Artificial intelligence
Noise (video)
business
010301 acoustics

Details

ISSN :: 0003682X
Volume :: 177
Database :: OpenAIRE
Journal :: Applied Acoustics
Accession number :: edsair.doi...........40f1596c73e04b32a6f91c3ba481ad7f
Full Text :: https://doi.org/10.1016/j.apacoust.2020.107665
