1. Machine learning and explainable artificial intelligence for the prevention of waterborne cryptosporidiosis and giardiosis.
- Author
- Ligda P, Mittas N, Kyzas GZ, Claerebout E, and Sotiraki S
- Subjects
- Humans, Oocysts, Waterborne Diseases prevention & control, Machine Learning, Cryptosporidiosis prevention & control, Cryptosporidiosis epidemiology, Cryptosporidium isolation & purification, Artificial Intelligence, Giardia isolation & purification, Giardiasis prevention & control, Giardiasis epidemiology
- Abstract
Cryptosporidium and Giardia are important parasitic protozoa due to their zoonotic potential and impact on human health, and have often caused waterborne outbreaks of disease. Detection of (oo)cysts in water matrices is challenging and extremely costly, so only a few countries have legislated for regular monitoring of drinking water for their presence. Several attempts have been made to investigate the association between the presence of such (oo)cysts in water and other biotic or abiotic factors, with inconclusive findings. In this regard, the aim of this study was to develop a holistic approach leveraging Machine Learning (ML) and eXplainable Artificial Intelligence (XAI) techniques in order to provide empirical evidence related to the presence and prediction of Cryptosporidium oocysts and Giardia cysts in water samples. To meet this objective, we initially modelled the complex relationship between Cryptosporidium and Giardia (oo)cysts and a set of parasitological, microbiological, physicochemical and meteorological parameters via a model-agnostic meta-learner algorithm that provides flexibility regarding the selection of the ML model executing the fitting task. Based on this generic approach, four well-known ML candidates were empirically evaluated in terms of their predictive capabilities. The best-performing algorithms were then further examined through XAI techniques to gain meaningful insights into the explainability and interpretability of the derived solutions. The findings reveal that Random Forest achieves the highest prediction performance when the objective is the prediction of both contamination and contamination intensity with Cryptosporidium oocysts in a given water sample, with meteorological/physicochemical markers being informative for the former and microbiological markers for the latter. For the prediction of contamination with Giardia, eXtreme Gradient Boosting with physicochemical parameters was the most efficient algorithm, while Support Vector Regression, which takes into consideration both microbiological and meteorological markers, was more efficient for evaluating the contamination intensity with cysts. The results of the study indicate that ML and XAI approaches can be a valuable tool for unveiling the complex associations underlying the presence and contamination intensity of these zoonotic parasites, which could in turn form a basis for the development of monitoring platforms and early warning systems for the prevention of waterborne disease outbreaks.
Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
(Copyright © 2024. Published by Elsevier Ltd.)
- Published
- 2024