1. Urdu Lip Reading Systems for Digits in Controlled and Uncontrolled Environment
- Author
- Amanullah Baloch, Mushtaq Ali, Lal Hussain, Touseef Sadiq, and Badr S. Alkahtani
- Subjects
Lip reading system, Urdu dataset, deep learning model, data augmentation, lips extraction, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Lip reading technology can significantly benefit various domains, such as enhancing communication for hearing-impaired people, assisting in noisy environments, and improving security with silent password inputs. Despite advancements in lip reading for several languages, there has been limited success in developing an effective model for Urdu lip reading, owing to the lack of an appropriate dataset and the challenges faced by earlier models, such as the unsuccessful adaptation of the LipNet model to Urdu. To address these issues, we introduce the ULRD dataset, employ diverse data augmentation techniques, and compare three DNN models: a Hybrid 2D-3D CNN-LSTM model, a LipNet-based 2D CNN-LSTM model, and a baseline 3D CNN-GRU model. Each model is evaluated in both controlled and uncontrolled environments, using both seen and unseen data. Results indicate that the LipNet-based 2D CNN-LSTM model achieves a high overall accuracy of 92.15% across all conditions, while the Hybrid model demonstrates impressive generalization, with an overall accuracy of 90.00% on unseen data, due to its enhanced spatiotemporal feature extraction capability. Additionally, the precision (0.91), recall (0.91), and F1-score (0.91) of the LipNet-based 2D CNN-LSTM model are higher than those of the competing models. These findings highlight the effectiveness of different DNN architectures and the potential improvements offered by the ULRD dataset for Urdu lip reading research.
- Published
- 2025