Gabriel Davis Jones, Symon M Kariuki, Anthony K Ngugi, Angelina Kakooza Mwesige, Honorati Masanja, Seth Owusu-Agyei, Ryan Wagner, J Helen Cross, Josemir W Sander, Charles R Newton, Arjune Sen, Hanna Abban, Patrick Adjei, Ken Ae-Ngibise, Francis Agbokey, Lisa Aissaoui, Albert Akpalu, Bright Akpalu, Sabina Asiamah, Gershim Asiki, Mercy Atieno, Evasius Bauni, Dan Bhwana, Mary Bitta, Christian Bottomley, Martin Chabi, Eddie Chengo, Neerja Chowdhary, Myles Connor, Helen Cross, Mark Collinson, Emmanuel Darkwa, Timothy Denison, Victor Doku, Tarun Dua, Isaac Egesa, Tony Godi, F. Xavier Gómez-Olivé, Simone Grassi, Samuel Iddi, Daniel Nana Yaw Abankwah Junior, Kathleen Kahn, Angelina Kakooza, Symon Kariuki, Gathoni Kamuyu, Clarah Khalayi, Henrika Kimambo, Immo Kleinschmidt, Thomas Kwasa, Sloan Mahone, Gergana Manolova, Alexander Mathew, William Matuja, David McDaid, Bruno Mmbando, Daniel Mtai Mwanga, Dorcas Muli, Victor Mung'ala Odera, Frederick Murunga Wekesah, Vivian Mushi, Anthony Ngugi, Peter Odermatt, Rachael Odhiambo, James O Mageto, Peter Otieno, George Pariyo, Stefan Peterson, Josemir Sander, Cynthia Sottie, Isolide Sylvester, Stephen Tollman, Yvonne Thoya, Rhian Twine, Sonia Vallentin, Richard Walker, Stella Waruingi, and the EPInA Study Group
Background Identification of convulsive epilepsy in sub-Saharan Africa relies on access to resources that are often unavailable. Infrastructure and resource requirements can further complicate case verification. Using machine-learning techniques, we have developed and tested a region-specific questionnaire panel and predictive model to identify people who have had a convulsive seizure. These findings have been implemented in a free app for health-care workers in Kenya, Uganda, Ghana, Tanzania, and South Africa.

Methods In this retrospective case-control study, we used data from the Studies of the Epidemiology of Epilepsy in Demographic Sites in Kenya, Uganda, Ghana, Tanzania, and South Africa. We randomly split the participants in a 7:3 ratio into a training dataset and a validation dataset. We used information gain and correlation-based feature selection to identify eight binary features to predict convulsive seizures. We then assessed several machine-learning algorithms to create a multivariate prediction model. We validated the best-performing model with the internal dataset and a prospectively collected external-validation dataset. We additionally evaluated a leave-one-site-out (LOSO) model, in which the model was trained on data from all sites except one, and the held-out site in turn formed the validation dataset. We used these features to develop a questionnaire-based predictive panel that we implemented in a multilingual app (the Epilepsy Diagnostic Companion) for health-care workers in each geographical region.

Findings We analysed epilepsy-specific data from 4097 people, of whom 1985 (48·5%) had convulsive epilepsy and 2112 were controls. From 170 clinical variables, we initially identified 20 candidate predictor features. Eight features were removed, six because of negligible information gain and two following review by a panel of qualified neurologists, leaving 12 candidates. Correlation-based feature selection identified eight variables that demonstrated predictive value; all but one were associated with an increased risk of an epileptic convulsion. The logistic regression, support vector machine, and naive Bayes models performed similarly, outperforming the decision-tree model. We chose the logistic regression model for its interpretability and implementability. The area under the receiver operating characteristic curve (AUC) was 0·92 (95% CI 0·91–0·94; sensitivity 85·0%; specificity 93·7%) in the internal-validation dataset and 0·95 (0·92–0·98; sensitivity 97·5%; specificity 82·4%) in the external-validation dataset. Similar results were observed for the LOSO model (AUC 0·94, 0·93–0·96; sensitivity 88·2%; specificity 95·3%).

Interpretation On the basis of these findings, we developed the Epilepsy Diagnostic Companion as a predictive model and app offering a validated culture-specific and region-specific solution to confirm the diagnosis of a convulsive epileptic seizure in people with suspected epilepsy. The questionnaire panel is simple and accessible for health-care workers without specialist knowledge to administer. This tool can be iteratively updated and could lead to earlier, more accurate diagnosis of seizures and improve care for people with epilepsy.

Funding The Wellcome Trust, the UK National Institute for Health Research, and the Oxford NIHR Biomedical Research Centre.
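
As an illustration of the leave-one-site-out evaluation and the sensitivity and specificity figures reported above, the following is a minimal Python sketch, not the study's actual code. The function names `leave_one_site_out` and `sensitivity_specificity`, the record layout (a `site` key and a binary `label`), and the toy records are all assumptions made for this example; no study data are used.

```python
from collections import defaultdict


def leave_one_site_out(records):
    """Yield (held_out_site, training_rows, validation_rows) for each site in turn."""
    by_site = defaultdict(list)
    for row in records:
        by_site[row["site"]].append(row)
    for held_out in sorted(by_site):
        # Train on every site except the held-out one.
        training = [row for site, rows in by_site.items()
                    if site != held_out for row in rows]
        yield held_out, training, by_site[held_out]


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = case."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls wrongly flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy, made-up records purely to show the data flow (hypothetical, not study data).
toy_records = [
    {"site": "site_a", "label": 1},
    {"site": "site_a", "label": 0},
    {"site": "site_b", "label": 1},
    {"site": "site_c", "label": 0},
]

for site, training, validation in leave_one_site_out(toy_records):
    print(site, "train:", len(training), "validate:", len(validation))
```

In a real analysis, a model would be fitted on each `training` split and its predictions on the matching `validation` split passed to `sensitivity_specificity`; the choice of probability threshold for turning model output into a binary prediction is a separate decision that this sketch does not address.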