101. DNN-based Approach to Detect and Classify Pathological Voice
- Author
- Feng-Chuan Lin, Shih-Hau Fang, Chi-Te Wang, Ji-Ying Chen, Yi-Te Hsu, Zong-Ying Chuang, Xiaotong Yu, and Zhezhuang Xu
- Subjects
- business.industry; Computer science; Speech recognition; Deep learning; 0206 medical engineering; 0202 electrical engineering, electronic engineering, information engineering; 020207 software engineering; 02 engineering and technology; Artificial intelligence; Mel-frequency cepstrum; business; 020601 biomedical engineering
- Abstract
- We participate in the FEMH 2018 Challenge, part of a big data subproject of the IEEE. The goal of the challenge is to detect pathological voices and to classify the disorders into phonotrauma, neoplasm, and vocal paralysis. Results are evaluated using sensitivity, specificity, and unweighted average recall (UAR). The database consists of 50 normal voice samples and 150 samples of common voice disorders recorded at a tertiary teaching hospital (Far Eastern Memorial Hospital, FEMH). This paper proposes a deep neural network based (DNN-based) approach to the challenge. For data preprocessing, we use Mel-frequency cepstral coefficients (MFCCs), which also carry emotion-specific information. Gradual spectral variations are captured using 13 MFCCs extracted from the speech signal. For disease detection, we compare the performance of different DNN structures (i.e., the number of hidden layers and the number of neurons). For disease classification, we compare the performance of different batch sizes, with and without normalization. Among the tested DNN structures, the best results were obtained with 5 hidden layers and 200 neurons.
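The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions (sensitivity and specificity from a two-class confusion matrix, UAR as the plain mean of per-class recalls); the abstract does not say how the challenge organizers computed them, and the function names `detection_metrics` and `unweighted_average_recall` are hypothetical, not taken from the paper.

```python
def detection_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity and UAR for a two-class task (assumed definitions)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Guard against an empty class to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # With two classes, UAR is the mean of the two per-class recalls.
    uar = (sensitivity + specificity) / 2
    return sensitivity, specificity, uar


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, each class weighted equally (multi-class UAR)."""
    recalls = []
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Toy labels for illustration only: 1 = pathological, 0 = normal.
    truth = [1, 1, 1, 1, 0, 0]
    guess = [1, 1, 1, 0, 0, 1]
    print(detection_metrics(truth, guess))
```

Because UAR weights every class equally, it is not dominated by the majority class, which matters for a dataset as imbalanced as the 50 normal versus 150 disordered samples described above.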
- Published
- 2018