1. Improving skin cancer detection by Raman spectroscopy using convolutional neural networks and data augmentation
- Author
Jianhua Zhao, Harvey Lui, Sunil Kalia, Tim K. Lee, and Haishan Zeng
- Subjects
Skin cancer detection, Raman spectroscopy, convolutional neural networks (CNN), artificial intelligence (AI), optical diagnosis, data augmentation, Neoplasms. Tumors. Oncology. Including cancer and carcinogens, RC254-282
- Abstract
Background: Our previous studies have demonstrated that Raman spectroscopy could be used for skin cancer detection with good sensitivity and specificity. The objective of this study is to determine if skin cancer detection can be further improved by combining deep neural networks and Raman spectroscopy.
Patients and methods: Raman spectra of 731 skin lesions were included in this study, containing 340 cancerous and precancerous lesions (melanoma, basal cell carcinoma, squamous cell carcinoma and actinic keratosis) and 391 benign lesions (melanocytic nevus and seborrheic keratosis). One-dimensional convolutional neural networks (1D-CNN) were developed for Raman spectral classification. The stratified samples were divided randomly into training (70%), validation (10%) and test set (20%), and were repeated 56 times using parallel computing. Different data augmentation strategies were implemented for the training dataset, including added random noise, spectral shift, spectral combination and artificially synthesized Raman spectra using one-dimensional generative adversarial networks (1D-GAN). The area under the receiver operating characteristic curve (ROC AUC) was used as a measure of the diagnostic performance. Conventional machine learning approaches, including partial least squares for discriminant analysis (PLS-DA), principal component and linear discriminant analysis (PC-LDA), support vector machine (SVM), and logistic regression (LR) were evaluated for comparison with the same data splitting scheme as the 1D-CNN.
Results: The ROC AUC of the test dataset based on the original training spectra were 0.886±0.022 (1D-CNN), 0.870±0.028 (PLS-DA), 0.875±0.033 (PC-LDA), 0.864±0.027 (SVM), and 0.525±0.045 (LR), which were improved to 0.909±0.021 (1D-CNN), 0.899±0.022 (PLS-DA), 0.895±0.022 (PC-LDA), 0.901±0.020 (SVM), and 0.897±0.021 (LR) respectively after augmentation of the training dataset (p
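The splitting and augmentation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the noise level `sigma`, the shift range `max_shift`, the number of `copies`, and the choice to chain the three augmentations in a single pass are all assumptions made for illustration, since the abstract does not give these details. Spectra are represented as plain lists of intensities, and the 1D-GAN synthesis step is not covered.

```python
import random


def stratified_split(labels, fractions=(0.7, 0.1, 0.2), seed=0):
    """Return (train, val, test) index lists, splitting each class separately."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train, val, test = [], [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_train = round(fractions[0] * len(idx))
        n_val = round(fractions[1] * len(idx))
        train += idx[:n_train]
        val += idx[n_train:n_train + n_val]
        test += idx[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def add_noise(spectrum, sigma, rng):
    """Add zero-mean Gaussian noise to every channel."""
    return [v + rng.gauss(0.0, sigma) for v in spectrum]


def shift_spectrum(spectrum, k):
    """Shift by k channels (assumes |k| < len), padding with the edge value."""
    n = len(spectrum)
    if k == 0:
        return list(spectrum)
    if k > 0:
        return [spectrum[0]] * k + list(spectrum[:n - k])
    return list(spectrum[-k:]) + [spectrum[-1]] * (-k)


def combine_spectra(a, b, w):
    """Weighted average of two spectra of the same class, with 0 <= w <= 1."""
    return [w * x + (1.0 - w) * y for x, y in zip(a, b)]


def augment_training_set(spectra, labels, copies=2, sigma=0.01,
                         max_shift=2, seed=0):
    """Append `copies` synthetic spectra per original spectrum (assumed scheme)."""
    rng = random.Random(seed)
    out_x, out_y = list(spectra), list(labels)
    for x, y in zip(spectra, labels):
        same_class = [s for s, l in zip(spectra, labels) if l == y]
        for _ in range(copies):
            new = combine_spectra(x, rng.choice(same_class), rng.random())
            new = shift_spectrum(new, rng.randint(-max_shift, max_shift))
            new = add_noise(new, sigma, rng)
            out_x.append(new)
            out_y.append(y)
    return out_x, out_y
```

Only the training indices would be passed to `augment_training_set`, so that the validation and test sets contain original spectra only, in line with the abstract's statement that augmentation was applied to the training dataset. Repeating `stratified_split` with different `seed` values gives repeated random splits of the kind the abstract describes.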
- Published
- 2024