1. ArabRecognizer: modern standard Arabic speech recognition inspired by DeepSpeech2 utilizing Franco-Arabic.
- Author
Nasef, Mohammed M., Elshall, Amr A., and Sauber, Amr M.
- Subjects
ARABIC language, SPEECH perception, ORAL communication, ENGLISH language, NEUROLINGUISTICS
- Abstract
Speech recognition is a critical task in spoken language applications. Globally known models such as DeepSpeech2 are effective for English speech recognition; however, they are not well suited to languages like Arabic. This paper addresses recognition of the Arabic language, specifically Modern Standard Arabic (MSA). It proposes two models that use "Franco-Arabic" as an encoding mechanism, together with additional enhancements, to recognize MSA. The first model uses Mel-Frequency Cepstral Coefficients (MFCCs) as input features, while the second employs six sequential Gated Recurrent Unit (GRU) layers. Each model is followed by a fully connected layer with a dropout layer, which helps reduce overfitting. The Connectionist Temporal Classification (CTC) loss is used to calculate the prediction error and to maximize the likelihood of the correct transcription. Two experiments were conducted for each model: the first used 41 h of continuous speech over 15 epochs, whereas the second used 69 h over 30 epochs. The experiments showed that the first model excels in speed while the second excels in accuracy, and that both outperformed the well-known DeepSpeech2. [ABSTRACT FROM AUTHOR]
- Published
- 2024
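The abstract states that CTC loss is used to score predictions against the correct transcription. As an illustration only, and not the authors' implementation, the following minimal pure-Python sketch computes the CTC likelihood of a label sequence with the standard forward (alpha) recursion. The function names `ctc_likelihood` and `ctc_loss`, the toy two-symbol vocabulary, and the choice of index 0 for the blank symbol are all assumptions made for this example.

```python
import math


def ctc_likelihood(probs, target, blank=0):
    """Probability that per-frame outputs `probs` collapse to `target` under CTC.

    probs  : list of T frames, each a list of per-symbol probabilities
    target : list of label ids without blanks (hypothetical toy labels)
    blank  : index of the CTC blank symbol (assumed to be 0 here)
    """
    # Extended target with blanks interleaved: [blank, y1, blank, y2, ..., blank]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    size = len(ext)

    # Initialise: a path may start on the leading blank or on the first label.
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            # Skip over a blank only between two different non-blank labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # Valid paths end on the last label or on the trailing blank.
    return alpha[-1] + (alpha[-2] if size > 1 else 0.0)


def ctc_loss(probs, target, blank=0):
    """Negative log-likelihood; minimising it maximises the transcription likelihood."""
    return -math.log(ctc_likelihood(probs, target, blank))


if __name__ == "__main__":
    # Two frames, vocabulary {0: blank, 1: "a"}, uniform outputs.
    frames = [[0.5, 0.5], [0.5, 0.5]]
    print(ctc_likelihood(frames, [1]))
    print(ctc_loss(frames, [1]))
```

In practice a framework implementation working in log space would be used for numerical stability; the plain-probability version here is kept only because it is short enough to check by hand.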