Novel audio characteristic-dependent feature extraction and data augmentation methods for cough-based respiratory disease classification.

Authors :
Shen J
Zhang X
Lu Y
Ye P
Zhang P
Yan Y
Source :
Computers in biology and medicine [Comput Biol Med] 2024 Sep; Vol. 179, pp. 108843. Date of Electronic Publication: 2024 Jul 18.
Publication Year :
2024

Abstract

Respiratory diseases are among the major health problems worldwide, and early diagnosis of the disease type is vitally important. As one of the main symptoms of many respiratory diseases, cough may carry information about different pathological changes in the respiratory system. In recent years, many researchers have therefore used cough sounds to diagnose diseases with artificial intelligence, typically relying on acoustic features and data augmentation methods borrowed from speech tasks. Although these methods are applicable, previous studies have not considered the characteristics of cough sound signals. In this paper, we designed a cough-based respiratory disease classification system and proposed audio characteristic-dependent feature extraction and data augmentation methods. First, given the short duration and rapid transitions of the different cough stages, we proposed a maximum overlapping mel-spectrogram to avoid the loss of inter-frame information caused by traditional framing methods. Second, we applied various data augmentation methods to mitigate the problem of limited labeled data. Based on the frequency energy distributions of cough audio from different diseases, we proposed a parameter-independent self-energy-based augmentation method to enhance the differences between frequency bands. Finally, in the model testing stage, we leveraged test-time augmentation to further improve classification performance by fusing the test results of the original audio and multiple augmented versions. The proposed methods were validated on the Coswara dataset through stratified four-fold cross-validation. Compared to the baseline model using the mel-spectrogram as input, the proposed methods achieved average absolute improvements of 3.33% and 3.10% in macro Area Under the Receiver Operating Characteristic Curve (macro AUC) and Unweighted Average Recall (UAR), respectively.
The visualization results through Gradient-weighted Class Activation Mapping (Grad-CAM) showed the contributions of different features to model decisions.

Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

(Copyright © 2024 Elsevier Ltd. All rights reserved.)
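
The abstract mentions fusing test results across the original and augmented audio (test-time augmentation) and reporting Unweighted Average Recall. The sketch below illustrates both ideas in plain Python. It is not the authors' implementation: the abstract does not state the fusion rule, so simple averaging of class probabilities is an assumption, and the function names `fuse_tta` and `unweighted_average_recall` as well as the example probabilities are hypothetical.

```python
from statistics import mean


def fuse_tta(prob_sets):
    """Fuse per-class probabilities by averaging across views.

    prob_sets holds one probability vector per view of the same clip
    (the original plus each augmented copy). Averaging is an assumed
    fusion rule; the abstract does not specify one.
    """
    n_classes = len(prob_sets[0])
    return [mean(p[c] for p in prob_sets) for c in range(n_classes)]


def unweighted_average_recall(y_true, y_pred):
    """UAR: the mean of per-class recalls, so every class counts equally."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return mean(recalls)


# Hypothetical model outputs for one clip: original view, then two augmented views.
clip_probs = [
    [0.40, 0.60],
    [0.70, 0.30],
    [0.60, 0.40],
]
fused = fuse_tta(clip_probs)
predicted = max(range(len(fused)), key=fused.__getitem__)
```

Because UAR averages recall per class rather than per sample, a classifier cannot score well on it by favoring the majority class, which is why it is commonly reported for imbalanced datasets.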

Details

Language :
English
ISSN :
1879-0534
Volume :
179
Database :
MEDLINE
Journal :
Computers in biology and medicine
Publication Type :
Academic Journal
Accession number :
39029433
Full Text :
https://doi.org/10.1016/j.compbiomed.2024.108843