7,328 results on '"Predictive modelling"'
Search Results
2. A review of predictive modelling and drone remote sensing technologies as a tool for detecting clandestine burials
- Author
-
Koopman, Marissa, Milliet, Quentin, and Champod, Christophe
- Published
- 2025
- Full Text
- View/download PDF
3. How to responsibly deploy a predictive modelling dashboard for study advisors? A use case illustrating various stakeholder perspectives
- Author
-
van Leeuwen, Anouschka, Goudriaan, Marije, and Aksu, Ünal
- Published
- 2024
- Full Text
- View/download PDF
4. Machine learning-based predictive modelling of biodiesel production from animal fats catalysed by a blast furnace slag geopolymer
- Author
-
Mwenge, Pascal and Rutto, Hilary
- Published
- 2025
- Full Text
- View/download PDF
5. Predictive modelling of the effectiveness of vaccines against COVID-19 in Bogotá: Methodological innovation involving different variants and computational optimisation efficiency
- Author
-
Espinosa, Oscar, White, Lisa, Bejarano, Valeria, Aguas, Ricardo, Rincón, Duván, Mora, Laura, Ramos, Antonio, Sanabria, Cristian, Rodríguez, Jhonathan, Barrera, Nicolás, Álvarez-Moreno, Carlos, Cortés, Jorge, Saavedra, Carlos, Robayo, Adriana, Gao, Bo, and Franco, Oscar
- Published
- 2024
- Full Text
- View/download PDF
6. Machine-learning synergy in high-entropy alloys: A review
- Author
-
Elkatatny, Sally, Abd-Elaziem, Walaa, Sebaey, Tamer A., Darwish, Moustafa A., and Hamada, Atef
- Published
- 2024
- Full Text
- View/download PDF
7. Predictive modelling of student dropout risk: Practical insights from a South Korean distance university
- Author
-
Seo, Eui-Yeong, Yang, Jaemo, Lee, Ji-Eun, and So, Geunju
- Published
- 2024
- Full Text
- View/download PDF
8. Microsatellite instability in mismatch repair proficient colorectal cancer: clinical features and underlying molecular mechanisms
- Author
-
Xu, Yun, Liu, Kai, Li, Cong, Li, Minghan, Zhou, Xiaoyan, Sun, Menghong, Zhang, Liying, Wang, Sheng, Liu, Fangqi, and Xu, Ye
- Published
- 2024
- Full Text
- View/download PDF
9. EHR-based prediction modelling meets multimodal deep learning: A systematic review of structured and textual data fusion methods
- Author
-
Teles, Ariel Soares, de Moura, Ivan Rodrigues, Silva, Francisco, Roberts, Angus, and Stahl, Daniel
- Published
- 2025
- Full Text
- View/download PDF
10. What elements of the opening set influence the outcome of a tennis match? An in-depth analysis of Wimbledon data
- Author
-
Gupta, Kapil, Krishnamurthy, Vijayshankar, and Deb, Soudeep
- Published
- 2024
- Full Text
- View/download PDF
11. Life on the edge: A new toolbox for population‐level climate change vulnerability assessments
- Author
-
Barratt, Christopher D, Onstein, Renske E, Pinsky, Malin L, Steinfartz, Sebastian, Kühl, Hjalmar S, Forester, Brenna R, and Razgour, Orly
- Subjects
Climate Change Impacts and Adaptation, Biological Sciences, Ecology, Evolutionary Biology, Genetics, Environmental Sciences, Human Genome, Biotechnology, Climate Action, Life on Land, adaptation, circuit theory, climate change vulnerability assessment, conservation, genomics, global change, informatics, predictive modelling, Environmental Science and Management, Zoology, Environmental management
- Abstract
Abstract: Global change is impacting biodiversity across all habitats on earth. New selection pressures from changing climatic conditions and other anthropogenic activities are creating heterogeneous ecological and evolutionary responses across many species' geographic ranges. Yet we currently lack standardised and reproducible tools to effectively predict the resulting patterns in species vulnerability to declines or range changes. We developed an informatic toolbox that integrates ecological, environmental and genomic data and analyses (environmental dissimilarity, species distribution models, landscape connectivity, neutral and adaptive genetic diversity, genotype‐environment associations and genomic offset) to estimate population vulnerability. In our toolbox, functions and data structures are coded in a standardised way so that it is applicable to any species or geographic region where appropriate data are available, for example individual or population sampling and genomic datasets (e.g. RAD‐seq, ddRAD‐seq, whole genome sequencing data) representing environmental variation across the species geographic range. To demonstrate multi‐species applicability, we apply our toolbox to three georeferenced genomic datasets for co‐occurring East African spiny reed frogs (Afrixalus fornasini, A. delicatus and A. sylvaticus) to predict their population vulnerability, as well as demonstrating that range loss projections based on adaptive variation can be accurately reproduced from a previous study using data for two European bat species (Myotis escalerai and M. crypticus). Our framework sets the stage for large scale, multi‐species genomic datasets to be leveraged in a novel climate change vulnerability framework to quantify intraspecific differences in genetic diversity, local adaptation, range shifts and population vulnerability based on exposure, sensitivity and landscape barriers.
- Published
- 2024
12. A spatial reconnaissance survey for gold exploration in a schist belt
- Author
-
Tende, Andongma W., Aminu, Mohammed D., Amuda, Abdulgafar K., Gajere, Jiriko N., Usman, Hadiza, and Shinkafi, Fatima
- Published
- 2021
- Full Text
- View/download PDF
13. Characterisation of cardiovascular disease (CVD) incidence and machine learning risk prediction in middle-aged and elderly populations: data from the China health and retirement longitudinal study (CHARLS).
- Author
-
Huang, Qing, Jiang, Zihao, Shi, Bo, Meng, Jiaxu, Shu, Li, Hu, Fuyong, and Mi, Jing
- Subjects
*OLDER people, *MACHINE learning, *SLEEP duration, *FEATURE selection, *WAIST circumference
- Abstract
Background: Due to the ageing population and evolving lifestyles occurring in China, middle-aged and elderly populations have become high-risk groups for cardiovascular disease (CVD). The aim of this study was to analyse the incidence characteristics of CVD in these populations and develop a prediction model by using data from the China Health and Retirement Longitudinal Study (CHARLS). Methods: We used follow-up data from the CHARLS to analyse CVD incidence in the Chinese middle-aged and elderly population over a time span of 9 years. Five machine learning (ML) algorithms were employed for risk prediction. Data preprocessing included missing value imputation via random forest. Feature selection was performed using the Least Absolute Shrinkage and Selection Operator (Lasso CV) method with cross-validation prior to model training. The synthetic minority over-sampling technique (SMOTE) was applied to address class imbalance. Model performance was evaluated via analyses including the area under the ROC curve (AUC), precision, recall, F1 score, and SHAP plots for interpretability. Results: In accordance with the exclusion criteria, 12,580, 12,061, 11,545, and 11,619 participants were enrolled in four follow-up rounds. The cumulative incidence (CI) of CVD at 2, 4, 7, and 9 years was 2.846%, 8.971%, 17.869% and 20.518%, respectively. Significant differences in CVD incidence were observed across gender, age, ethnicity, and region, with higher rates observed in females and in the northeast region. Ultimately, 8,080 participants and 24 features were analysed for CVD risk prediction. Five ML models were built based on these features. Although the LGB model achieves an AUC of 0.818, indicating strong overall performance, its F1 score and recall rate are relatively low, at 0.509 and 43.1%, respectively. 
Shapley additive explanations (SHAP) analyses revealed the importance of key features, such as night sleep duration, TG levels, and waist circumference, in predicting outcomes, and highlighted the nonlinear relationships between these features and CVD risk. Conclusions: Gender, age, ethnicity, and region are significant factors influencing CVD incidence. Although the LGB model demonstrates good overall performance, its low F1 score and recall rate reveal limitations in identifying high-risk cardiovascular disease patients. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
14. Explorative Short-Term Predictive Models for the Belgian (Energy) Renovation Market Incorporating Macroeconomic and Sector-Specific Variables.
- Author
-
Gepts, Bieke, Nuyts, Erik, and Verbeeck, Griet
- Abstract
Retrofitting existing buildings is a cornerstone of Europe's strategy for a sustainable built environment. Therefore, accurate short-term forecasts to evaluate policy impacts and inform future strategies are needed. This study investigates the short-term predictive modelling of renovation activity in Belgium, focusing on overall renovation activity (RA) and energy-specific renovation activity (EA). Using data from 2012 to 2023, linear modelling was employed to analyze the relationships between RA/EA and macroeconomic indicators, market confidence, building permits, and loan data, with model performance evaluated using Mean Absolute Percentage Error (MAPE). Monthly data and time lags of up to 24 months were considered. The three best-performing models for RA achieved MAPE values between 2.9% and 3.1%, with validated errors ranging from 0.1% to 4.1%. For EA, the best models yielded MAPE values between 4.4% and 4.6% and validated errors between 8.9% and 14%. Renovation loans and building permits emerged as strong predictors for RA, while building material prices and loans were more relevant for EA. The time lag analysis highlighted that renovation processes typically span 15–24 months following loan approval. However, the low accuracy observed for EA underscores the need for further refinement. This explorative effort forms a solid base, inviting additional research to enhance our predictive capabilities and improve short-term modelling of the (green) residential renovation market. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
15. Development of an Analytical Model for Predicting the Shear Viscosity of Polypropylene Compounds.
- Author
-
Seifert, Lukas, Leuchtenberger-Engel, Lisa, and Hopmann, Christian
- Subjects
*POLYMER blends, *PRODUCT quality, *RAW materials, *MACHINE learning, *VISCOSITY
- Abstract
The need for an efficient adaptation of existing polypropylene (PP) formulations or the creation of new formulations has become increasingly important in various industries. Variations in viscosity resulting from changes in raw materials, fillers, and additives can have a significant impact on the processing and quality of PP products. This study presents the development of an analytical model designed to predict the shear viscosity of complex PP blends. By integrating established mixing rules with novel fitting parameters, the model provides a systematic and efficient method for managing variability in PP formulations. Experimental data from binary and multi-component blends were used to validate the model, demonstrating high prediction accuracy over a range of shear rates. The proposed model serves as a valuable tool for compounders and manufacturers to optimise PP formulations and develop new recipes with consistent processing and product quality. Future work will include industrial-scale trials and further evaluation against advanced machine learning approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
16. Study on influencing factors and prediction model of strength and compression index of sandy silt on bank under freeze–thaw cycles.
- Author
-
Yang, Zhen, Mou, Xianyou, Li, Hao, Ji, Honglan, Mao, Yuxin, and Song, Hongze
- Subjects
*SHEAR strength of soils, *FROZEN ground, *SOIL cohesion, *SHEAR strength, *INTERNAL friction, *COHESION, *FREEZE-thaw cycles
- Abstract
The Inner Mongolia section of the Yellow River is a seasonal frozen soil area, where the freeze–thaw effect can alter soil strength and compressibility, affecting bank stability. This study takes the sandy silt on the banks of the Inner Mongolia section of the Yellow River as the research object. It systematically investigates the relationship between shear strength parameters and compression index of sandy silt and the initial dry density, water content, and freeze–thaw cycles of the soil. It analyzes the order and significance of influencing factors, establishes prediction models of shear strength and compression index, and evaluates the effects of freeze–thaw cycles on soil cohesion and shear strength. The results show that the shear strength index of sandy silt is proportional to changes in initial dry density and inversely proportional to changes in water content. After 10 freeze–thaw cycles, the cohesion of the soil decreases by 22.53 to 58.85%, and the shear strength decreases by 22.67 to 58.91%. The internal friction angle is less affected by freeze–thaw and tends to be stable overall. The smaller the initial dry density and the greater the water content, the greater the compression index and compressibility of the soil, but freeze–thaw has little effect on compression index. The factors affecting sandy silt shear strength and compression index are ranked as dry density > moisture content > freeze–thaw cycles. The stepwise regression model of soil shear strength and compression index based on initial dry density, water content, and freeze–thaw cycles is effective, providing technical guidance for engineering practice. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
17. Research on Cross-Border e-Commerce Supply Chain Prediction and Optimization Model Based on Convolutional Neural Network Algorithm.
- Author
-
Zhao, Yajie, Gong, Bin, and Huang, Bo
- Subjects
*METAHEURISTIC algorithms, *CONVOLUTIONAL neural networks, *CROSS-border e-commerce, *OPTIMIZATION algorithms, *SUPPLY chain management
- Abstract
Enhancing the precision of supply chain management and reducing operational costs are crucial for the development of the cross-border e-commerce market. However, existing research often overlooks the demand uncertainty caused by seasonal variations and the challenges of handling returns in logistics. Therefore, this paper proposes a SARIMA-CNN-BiLSTM prediction model that effectively captures both the seasonal and nonlinear characteristics of cross-border e-commerce supply chains. Additionally, by incorporating the returns process, a supply chain distribution optimization model is developed with the objective of minimizing total operational costs. The model is solved using an improved whale optimization algorithm. In validation with real-world data, the SARIMA-CNN-BiLSTM model achieved a mean absolute percentage error reduction of 6.479 and 7.703 compared to convolutional neural network (CNN) and BiLSTM models, respectively. Moreover, the chosen optimization algorithm reduced the cost by 231,310 CNY, 62,564 CNY, and 131,632 CNY compared to the whale optimization algorithm, genetic algorithm, and particle swarm optimization, respectively. The proposed approach provides robust support for cross-border e-commerce enterprises in reducing costs and enhancing efficiency in their operations. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
18. An enhanced predictive modelling framework for highly accurate non-alcoholic fatty liver disease forecasting.
- Author
-
Arora, Nidhi, Srivastava, Shilpa, Tripathi, Aprna, and Gupta, Varuna
- Subjects
MACHINE learning, NON-alcoholic fatty liver disease, FEATURE selection, DATA scrubbing, RANDOM forest algorithms
- Abstract
Non-alcoholic fatty liver disease (NAFLD) is a chronic medical ailment characterized by accumulation of excessive fat in the liver of non-alcoholic patients. In the absence of any early visible indications, application of machine learning based predictive techniques for early prediction of NAFLD is quite beneficial. The objective of this paper is to present a complete framework for guided development of varied predictive machine learning models and predict NAFLD disease with high accuracy. The framework employs step-by-step data quality enhancement to medical data such as cleaning, normalization, data upscaling using SMOTE (for handling class imbalances) and correlation analysis-based feature selection to predict NAFLD with high accuracy using only clinically recorded identifiers. Comprehensive comparative analysis of prediction results of seven machine learning predictive models is done using unprocessed as well as quality enhanced data. As per the observed results, XGBoost, random forest and neural network machine learning models reported significantly higher accuracies with improved 'AUC' and 'ROC' values using preprocessed data in contrast to unprocessed data. The prediction results, also assessed on quality metrics such as 'accuracy', 'f1-score', 'precision', and 'recall', significantly support the need for the presented methodologies for qualitative NAFLD prediction modelling. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
19. Transformer–Gate Recurrent Unit-Based Hourly Purified Natural Gas Prediction Algorithm.
- Author
-
Su, Chang, Huang, Jingcai, Dong, Shasha, He, Yuqi, Li, Ji, Hu, Luyao, Liu, Xiao, and Liao, Yong
- Subjects
PREDICTION algorithms, RECURRENT neural networks, NATURAL gas, PROCESS capability, INDUSTRIAL robots
- Abstract
With the rapid development of industrial automation and intelligence, the consumption of resources and the environmental impact of production processes can no longer be ignored. Natural gas, as a commonly used energy source, produces significantly lower emissions of carbon dioxide, sulphur dioxide, and nitrogen oxides from combustion than coal and oil, and can be further purified to remove the small amount of impurities it contains, such as sulphur compounds. Therefore, purified natural gas (hereinafter referred to as purified gas), as a clean energy source, plays an important role in realising sustainable development. At the same time, it becomes more and more important to dispatch purified gas resources reasonably and accurately, and the paramount factor is that the load of purified gas needs to be predicted accurately. Therefore, this paper proposes a Transformer–GRU-based hourly prediction model for purified gas. The model uses the Transformer model for data fusion and feature extraction, and then combines the time series processing capability of the Gate Recurrent Unit (GRU) model to capture long-term dependencies and short-term dynamic changes in time series data. In this paper, the purified gas load data of Chongqing Municipality in 2020 was first preprocessed, and then divided into daily and hourly load datasets according to the measurement step. Meanwhile, considering the influence of the temperature factor, the experimental dataset was subdivided according to whether it includes temperature data or not, and then the Transformer–GRU model was built for prediction, respectively. 
The results show that, compared with the Dual-Stage Attention-Based Recurrent Neural Network (DA-RNN) and the Transformer and GRU models alone, the Transformer–GRU model exhibits good performance in terms of the coefficient of determination, the average absolute percentage error, and mean square error, which can well meet the requirement of hourly prediction accuracy and has greater application value. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
20. A bibliometric review of predictive modelling for cervical cancer risk.
- Author
-
Ngema, Francis, Mdhluli, Bonginkosi, Mmileng, Pako, Shungube, Precious, Makgaba, Mokgoropo, and Hossana, Twinomurinzi
- Subjects
NATURAL language processing, MACHINE learning, MEDICAL personnel, CERVICAL cancer, ARTIFICIAL intelligence
- Abstract
Cervical cancer represents a significant public health challenge, particularly affecting women's health globally. This study aims to advance the understanding of cervical cancer risk prediction research through a bibliometric analysis. The study identified 800 records from Scopus and Web of Science databases, which were reduced to 142 unique records after removing duplicates. Out of 100 abstracts assessed, 42 were excluded based on specific criteria, resulting in 58 studies included in the bibliometric review. Multiple scoping methods such as thematic analysis, citation analysis, bibliographic coupling, natural language processing, Latent Dirichlet Allocation and other visualisation techniques were used to analyse related publications between 2013 and 2024. The key findings revealed the importance of interdisciplinary collaboration in cervical cancer risk prediction, integrating expertise from mathematical disciplines, biomedical health, healthcare practitioners, public health, and policy. This approach significantly enhanced the accuracy and efficiency of cervical cancer detection and predictive modelling by adopting advanced machine learning algorithms, such as random forests and support vector machines. The main challenges were the lack of external validation on independent datasets and the need to address model interpretability to ensure healthcare providers understand and trust the predictive models. The study revealed the importance of interdisciplinary collaboration in cervical cancer risk prediction. It made recommendations for future research to focus on increasing the external validation of models, improving model interpretability, and promoting global research collaborations to enhance the comprehensiveness and applicability of cervical cancer risk prediction models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
21. Predictive modelling of metabolic syndrome in Ghanaian diabetic patients: an ensemble machine learning approach.
- Author
-
Acheampong, Emmanuel, Adua, Eric, Obirikorang, Christian, Anto, Enoch Odame, Peprah-Yamoah, Emmanuel, Obirikorang, Yaa, Asamoah, Evans Adu, Opoku-Yamoah, Victor, Nyantakyi, Michael, Taylor, John, Buckman, Tonnies Abeku, Yakubu, Maryam, and Afrifa-Yamoah, Ebenezer
- Subjects
*MACHINE learning, *TYPE 2 diabetes, *FEATURE selection, *PLURALITY voting, *SUPPORT vector machines
- Abstract
Objectives: The burgeoning prevalence of cardiometabolic disorders, including type 2 diabetes mellitus (T2DM) and metabolic syndrome (MetS) within Africa is concerning. Machine learning (ML) techniques offer a unique opportunity to leverage data-driven insights and construct predictive models for MetS risk, thereby enhancing the implementation of personalised prevention strategies. In this work, we employed ML techniques to develop predictive models for pre-MetS and MetS among diabetic patients. Methods: This multi-centre cross-sectional study comprised of 919 T2DM patients. Age, gender, novel anthropometric indices along with biochemical measures were analysed using BORUTA feature selection and an ensemble majority voting classification model, which included logistic regression, k-nearest neighbour, Gaussian Naive Bayes, Gradient boosting classification, and support vector machine. Results: Distinct metabolic profiles and phenotype clusters were associated with MetS progression. The BORUTA algorithm identified 10 and 16 significant features for pre-MetS and MetS prediction, respectively. For pre-MetS, the top-ranked features were lipid accumulation product (LAP), triglyceride-glucose index adjusted for waist-to-height ratio (TyG-WHtR), coronary risk (CR), visceral adiposity index (VAI) and abdominal volume index (AVI). For MetS prediction, the most influential features were VAI, LAP, waist triglyceride index (WTI), Very low-density cholesterol (VLDLC) and TyG-WHtR. Majority voting ensemble classifier demonstrated superior performance in predicting pre-MetS (AUC = 0.79) and MetS (AUC = 0.87). Conclusion: Identifying these risk factors reveals the complex interplay between visceral adiposity and metabolic dysregulation in African populations, enabling early detection and treatment. 
Ethical integration of ML algorithms in clinical decision-making can streamline identification of high-risk individuals, optimize resource allocation, and enable precise, tailored interventions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. Predictive modelling of HHV and LHV for sugar industry by-products: a study on sugarcane trash leaves, bagasse, and filter cake.
- Author
-
Sanchumpu, Pasawat, Suaili, Wiriya, Nonsawang, Siwakorn, Ansuree, Peeranat, and Laloon, Kittipong
- Subjects
COMBUSTION efficiency, RAW materials, RENEWABLE energy sources, SUGAR industry, WASTE management
- Abstract
This study aimed to enhance the efficiency of by-product raw materials from the sugar industry for use as fuel. The approach involved developing an equation to calculate the higher heating value (HHV) for each type of raw material using a regression method. Additionally, a simplex-centroid mixture design (SCMD) was employed to estimate the lower heating value (LHV) based on the mixing ratios by weight of sugarcane trash leaves (SCL), sugarcane bagasse (SCB), and filter cake (FTC). The results demonstrated that the developed model accurately estimated the HHV for each raw material. The ultimate analysis showed high statistical appropriateness, with an R2 of 0.83. The standard error of estimation was 0.74 MJ/kg, and the mean absolute error was 0.76%. Furthermore, the SCMD effectively estimated the LHV of the SCL, SCB, and FTC mixture ratios, achieving an R2 of 99.77%. The evaluation and validation of the prediction equation revealed a mean absolute error of 7.57% and a mean bias error of 6.31%. The findings of this study can be used to enhance the combustion efficiency of sugar industry by-products for use as fuel by selecting the optimal mixing ratio for each type of raw material. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. Predictive modelling of HHV and LHV for sugar industry by-products: a study on sugarcane trash leaves, bagasse, and filter cake
- Author
-
Pasawat Sanchumpu, Wiriya Suaili, Siwakorn Nonsawang, Peeranat Ansuree, and Kittipong Laloon
- Subjects
Predictive modelling, mixture ratios, renewable energy, waste management, simplex-centroid mixture design, Engineering (General). Civil engineering (General), TA1-2040
- Abstract
This study aimed to enhance the efficiency of by-product raw materials from the sugar industry for use as fuel. The approach involved developing an equation to calculate the higher heating value (HHV) for each type of raw material using a regression method. Additionally, a simplex-centroid mixture design (SCMD) was employed to estimate the lower heating value (LHV) based on the mixing ratios by weight of sugarcane trash leaves (SCL), sugarcane bagasse (SCB), and filter cake (FTC). The results demonstrated that the developed model accurately estimated the HHV for each raw material. The ultimate analysis showed high statistical appropriateness, with an R2 of 0.83. The standard error of estimation was 0.74 MJ/kg, and the mean absolute error was 0.76%. Furthermore, the SCMD effectively estimated the LHV of the SCL, SCB, and FTC mixture ratios, achieving an R2 of 99.77%. The evaluation and validation of the prediction equation revealed a mean absolute error of 7.57% and a mean bias error of 6.31%. The findings of this study can be used to enhance the combustion efficiency of sugar industry by-products for use as fuel by selecting the optimal mixing ratio for each type of raw material.
- Published
- 2024
- Full Text
- View/download PDF
24. Nomogram for predicting cervical lymph node metastasis of papillary thyroid carcinoma using deep learning-based super-resolution ultrasound image
- Author
-
Xia Li, Yu Zhao, Wenhui Chen, Xu Huang, Yan Ding, Shuangyi Cao, Chujun Wang, and Chunquan Zhang
- Subjects
Papillary thyroid carcinoma, Cervical lymph node metastases, Deep learning, Super-resolution reconstruction, Predictive modelling, Neoplasms. Tumors. Oncology. Including cancer and carcinogens, RC254-282
- Abstract
Abstract Objectives To investigate the feasibility and effectiveness of a deep learning (DL) super-resolution (SR) ultrasound image reconstruction model for predicting cervical lymph node status in patients with papillary thyroid carcinoma(PTC). Methods In this retrospective study, researchers recruited 544 patients with PTC and randomly assigned them to training and test sets. SR ultrasound images were acquired using SR technology to improve image resolution, and artificial features and DL features were extracted from the original (OR) and SR images, respectively, to construct a ML, DL model. The best model was selected and aggregated with clinical parameters to construct the nomogram. The performance of the model is evaluated by ROC curves, calibration curves and decision curves. Results In distinguishing the presence or absence of metastatic lymph nodes, the predictive performance of the SR_ResNet 101 and SR_SVM models based on SR outperformed those based on OR. In the test set, SR_SVM AUC was 0.878 (95% CI 0.8203–0.9358), accuracy 0.854, while OR_SVM AUC was 0.822 (95% CI 0.7500–0.8937), accuracy 0.665. SR_ResNet 101 AUC was 0.799 (95% CI 0.7175–0.8806), accuracy 0.793, and OR_ResNet101 AUC was 0.751 (95% CI 0.6620–0.8401), accuracy 0.713. Subsequently, Nomogram_A and Nomogram_B were constructed by integrating the SR_SVM model and SR_ResNet 101 model, respectively, with clinical parameters, while Nomogram_C was constructed solely based on clinical indicators. In the test set, Nomogram_A demonstrated the best performance with an AUC of 0.930 (95% CI 0.8913–0.9682) and accuracy was 0.829. Nomogram_B AUC 0.868 (95% CI 0.8102–0.9261) and accuracy was 0.829, while Nomogram_C AUC 0.880 (95% CI 0.8257–0.9349) and accuracy was 0.787. The DeLong test revealed that the diagnostic performance of Nomogram_A based on SR_SVM was significantly higher than that of Nomogram_B, Nomogram_C, and the level of Radiologist (P
- Published
- 2024
- Full Text
- View/download PDF
25. Modelling the temporal trajectories of human milk components
- Author
-
József Baranyi, Tünde Pacza, Mayara L. Martins, Sagar K. Thakkar, and Tinu M. Samuel
- Subjects
Food composition, Human Milk, Predictive modelling, Saturation model, Longitudinal data, Error estimation, Gynecology and obstetrics, RG1-991
- Abstract
Abstract Background This paper demonstrates how available data can be explored and utilized to conclude generic patterns in the temporal changes in Human Milk (HM) composition. Methods The temporal trajectories of selected human milk components (HMC-s) were described, in the first four months postpartum, by a primary model consisting of two phases: a short linear phase in the colostrum, triggered by the parturition; and a longer second phase, where the concentration of the component converges to a steady state. The model was fitted to data available in a recently published database of temporal HMC trajectories both at the levels of individual molecules (such as specific fatty acid, oligosaccharide, and mineral molecules) and molecule-groups (such as total protein, total fat). Results The properties of the trajectories suggest that experimental designs should follow non-equidistant sampling times, with increasingly longer time intervals after the first week postpartum. A selected parameter, the final stationary level, of the primary model was then studied as a function of geographical location (secondary modelling). Conclusions We found that the total variation of the concentration of specific HMC-s is dominantly due to the inherent biological differences between individual mothers and to a lesser extent to the geographical location.
- Published
- 2024
- Full Text
- View/download PDF
26. Importance of OCT-derived biomarkers for the recurrence of central serous chorioretinopathy using statistics and predictive modelling
- Author
-
Emilien Seiler, Léon Delachaux, Jennifer Cattaneo, Ali Garjani, Thibaud Martin, Alexia Duriez, Jérémy Baffou, Sepehr Mousavi, Ilenia Meloni, Ciara Bergin, Mattia Tomasoni, and Chiara M. Eandi
- Subjects
Central serous chorioretinopathy ,Predictive modelling ,Biomarker ,Retina ,Choroid ,Optical coherence tomography ,Medicine ,Science - Abstract
Central serous chorioretinopathy (CSCR) is a retinal disease characterised by the accumulation of subretinal fluid, which often resolves spontaneously in acute cases. However, approximately one-third of patients experience recurrences that may cause severe and irreversible vision loss. This study aimed to identify parameters derived from optical coherence tomography (OCT) that are associated with CSCR recurrence. Our dataset included 5211 OCT scans from 344 eyes of 255 patients diagnosed with CSCR. 178 eyes were identified as recurrent, 109 as non-recurrent, and 57 were excluded. We extracted parameters using artificial intelligence algorithms based on U-Nets, convolutional kernels, and morphological operators. We applied inferential statistics to evaluate differences between the recurrent and non-recurrent groups, and we used a logistic regression predictive model, reporting the coefficients as a measure of biomarker importance. We identified nine predictive biomarkers for CSCR recurrence: age, intraretinal fluid, subretinal fluid, pigment epithelial detachments, choroidal vascularity index, integrity of photoreceptors and retinal pigment epithelium layer, choriocapillaris and choroidal stroma thickness, and thinning of the outer nuclear layer, and of the inner nuclear layer combined with the outer plexiform layer. These results could enable future developments in the automatic detection of CSCR recurrence, paving the way for translational medical applications.
- Published
- 2024
27. Seismic site characterization baseline data for microzonation and site response analysis of Otuasega Town, Bayelsa State, Niger Delta region of Nigeria
- Author
-
Gamil M. S. Abdullah, Charles Kennedy, Ashok Kumar, Waleligne Molla Salilew, and Omrane Benjeddou
- Subjects
Geotechnical investigation ,Soil stratigraphy ,Index properties ,Shear wave velocity ,Site classification ,Predictive modelling ,Medicine ,Science - Abstract
This study presents the findings of a comprehensive geotechnical and seismic site investigation conducted at Otuasega Town located in Bayelsa State within the Niger Delta region of Nigeria. Subsurface exploration involved advancing 10 boreholes to 30 m depth using hollow stem auger drilling. Continuous disturbed and undisturbed soil sampling was performed at 1.5 m intervals for detailed geotechnical testing. Laboratory tests on the recovered soil samples established the index properties, classification, densities and consistency limits of the stratified deposits. The subsurface profile comprised alternating layers of clay, silt and sand typical of deltaic sediments, with the clay fractions exhibiting medium to high plasticity. Shear wave velocity (Vs) profiling using Multichannel Analysis of Surface Waves (MASW) techniques categorised the site predominantly as Site Class C and D based on international standards. The Standard Penetration Test (SPT) N-values ranged from 5 to 10, indicating soft normally consolidated clay conditions typical of the Niger Delta region. Predictive empirical models developed from the field and lab data showed strong correlations for estimating key geotechnical parameters such as SPT blow count, Vs and liquefaction resistance. Ground response analyses using the Vs and SPT data indicated significant site amplification potential, with peak ground accelerations up to 1.5 times the bedrock motion. Liquefaction analysis based on the empirical SPT-based methods revealed a high potential for liquefaction in the sandy layers, especially under strong earthquake shaking. The study characterized the complex sedimentology and provided baseline information for seismic microzonation and site-specific ground response analyses to advance understanding of geohazards in this delta environment.
- Published
- 2024
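The abstract above does not name the standard used for site classification. Assuming a NEHRP-style scheme based on the time-averaged shear wave velocity of the top 30 m (Vs30), the classification step might look like the sketch below. The class boundaries follow that assumed scheme, and the layer thicknesses and velocities are invented for illustration, not taken from the study.

```python
def vs30(layers):
    """Time-averaged shear wave velocity over the top 30 m.

    layers: list of (thickness_m, vs_m_per_s) tuples, ordered from the
    surface downward. Vs30 = 30 / sum(h_i / Vs_i).
    """
    depth = 0.0
    travel_time = 0.0
    for thickness, vs in layers:
        used = min(thickness, 30.0 - depth)  # clip the profile at 30 m
        if used <= 0:
            break
        travel_time += used / vs
        depth += used
    if depth < 30.0:
        raise ValueError("profile is shallower than 30 m")
    return 30.0 / travel_time

def site_class(v):
    """Map Vs30 (m/s) to a site class; NEHRP-style boundaries are assumed."""
    if v > 1500:
        return "A"
    if v > 760:
        return "B"
    if v > 360:
        return "C"
    if v >= 180:
        return "D"
    return "E"

# Hypothetical three-layer profile (thickness in m, Vs in m/s).
profile = [(5.0, 150.0), (10.0, 250.0), (15.0, 400.0)]
profile_class = site_class(vs30(profile))
```

Because Vs30 is a travel-time average, slow near-surface layers pull it down more than a simple depth-weighted mean would.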
28. Predicting Leukoplakia and Oral Squamous Cell Carcinoma Using Interpretable Machine Learning: A Retrospective Analysis
- Author
-
Salem Shamsul Alam, Saif Ahmed, Taseef Hasan Farook, and James Dudley
- Subjects
oral cancer ,white lesion ,Random Forest classifier ,SHAP ,predictive modelling ,Dentistry ,RK1-715 - Abstract
Purpose: The purpose of this study is to assess the effectiveness of the best performing interpretable machine learning models in the diagnoses of leukoplakia and oral squamous cell carcinoma (OSCC). Methods: A total of 237 patient cases were analysed that included information about patient demographics, lesion characteristics, and lifestyle factors, such as age, gender, tobacco use, and lesion size. The dataset was preprocessed and normalised, and then separated into training and testing sets. The following models were tested: K-Nearest Neighbours (KNN), Logistic Regression, Naive Bayes, Support Vector Machine (SVM), and Random Forest. The overall accuracy, Kappa score, class-specific precision, recall, and F1 score were used to assess performance. SHAP (SHapley Additive ExPlanations) was used to interpret the Random Forest model and determine the contribution of each feature to the predictions. Results: The Random Forest model had the best overall accuracy (93%) and Kappa score (0.90). For OSCC, it had a precision of 0.91, a recall of 1.00, and an F1 score of 0.95. The model had a precision of 1.00, recall of 0.78, and F1 score of 0.88 for leukoplakia without dysplasia. The precision for leukoplakia with dysplasia was 0.91, the recall was 1.00, and the F1 score was 0.95. The top three features influencing the prediction of leukoplakia with dysplasia are buccal mucosa localisation, ages greater than 60 years, and larger lesions. For leukoplakia without dysplasia, the key features are gingival localisation, larger lesions, and tongue localisation. In the case of OSCC, gingival localisation, floor-of-mouth localisation, and buccal mucosa localisation are the most influential features. Conclusions: The Random Forest model outperformed the other machine learning models in diagnosing oral cancer and potentially malignant oral lesions with higher accuracy and interpretability. The machine learning models struggled to identify dysplastic changes. 
Using SHAP improves the understanding of the importance of features, facilitating early diagnosis and possibly reducing mortality rates. The model notably indicated that lesions on the floor of the mouth were highly unlikely to be dysplastic, instead showing one of the highest probabilities for being OSCC.
- Published
- 2024
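The F1 score is the harmonic mean of precision and recall, so the class-specific figures quoted in the abstract above can be checked for internal consistency. The short sketch below recomputes F1 from the reported precision and recall; the helper name `f1_score` and the dictionary layout are illustrative choices, not part of the study.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F1) as quoted in the abstract.
reported = {
    "OSCC": (0.91, 1.00, 0.95),
    "leukoplakia without dysplasia": (1.00, 0.78, 0.88),
    "leukoplakia with dysplasia": (0.91, 1.00, 0.95),
}

# Recompute F1 for each class and round to two decimals for comparison.
recomputed = {
    label: round(f1_score(p, r), 2) for label, (p, r, _) in reported.items()
}
for label, (_, _, f1) in reported.items():
    print(f"{label}: recomputed {recomputed[label]:.2f}, reported {f1:.2f}")
```

All three recomputed values agree with the reported F1 scores to two decimal places.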
29. Development of a machine learning predictive model for early detection of breast cancer [version 1; peer review: awaiting peer review]
- Author
-
Rinsy Rahman, Dola Saha, Winniecia Dkhar, Sathyendranath Malli, and Neil Barnes Abraham
- Subjects
Research Article ,Articles ,Breast cancer ,Mammography ,Machine learning ,Tumor classification ,Predictive modelling - Abstract
Background: Breast cancer remains a significant global health concern, with over 7.8 million cases reported in the last five years. Early detection and accurate classification are crucial for reducing mortality rates and improving outcomes. Machine learning (ML) has emerged as a transformative tool in medical imaging, enabling more efficient and accurate diagnostic processes. Objective: This study aims to develop a machine learning-based predictive model for early detection and classification of breast cancer using the Wisconsin Breast Cancer Diagnostic dataset. Methods: The dataset, comprising 569 samples and 32 features derived from fine needle aspirate biopsy images, was pre-processed through data cleaning, normalization using the Robust Scaler, and feature selection. Five supervised ML algorithms (Logistic Regression, Support Vector Classification (SVC) with linear and radial basis function (RBF) kernels, Decision Tree, and Random Forest) were implemented. Models were evaluated using performance metrics, including accuracy, precision, sensitivity, specificity, and F1 scores. Results: The SVC-RBF model demonstrated the highest accuracy (98.68%) and balanced performance across other metrics, making it the most effective classifier for distinguishing between benign and malignant tumors. Key features such as texture mean and area (worst) significantly contributed to classification accuracy. Conclusions: This study highlights the potential of ML algorithms, particularly SVC-RBF, to revolutionize breast cancer diagnostics through improved accuracy and efficiency. Future research should validate these findings with diverse datasets and explore their integration into clinical workflows to enhance decision-making and patient care.
- Published
- 2025
30. Nomogram for predicting cervical lymph node metastasis of papillary thyroid carcinoma using deep learning-based super-resolution ultrasound image.
- Author
-
Li, Xia, Zhao, Yu, Chen, Wenhui, Huang, Xu, Ding, Yan, Cao, Shuangyi, Wang, Chujun, and Zhang, Chunquan
- Subjects
FEATURE extraction ,LYMPHATIC metastasis ,IMAGE reconstruction ,ULTRASONIC imaging ,DEEP learning - Abstract
Objectives: To investigate the feasibility and effectiveness of a deep learning (DL) super-resolution (SR) ultrasound image reconstruction model for predicting cervical lymph node status in patients with papillary thyroid carcinoma (PTC). Methods: In this retrospective study, researchers recruited 544 patients with PTC and randomly assigned them to training and test sets. SR ultrasound images were acquired using SR technology to improve image resolution, and handcrafted features and DL features were extracted from both the original (OR) and SR images to construct machine learning (ML) and DL models, respectively. The best model was selected and combined with clinical parameters to construct the nomogram. Model performance was evaluated using ROC curves, calibration curves and decision curves. Results: In distinguishing the presence or absence of metastatic lymph nodes, the SR_ResNet101 and SR_SVM models based on SR images outperformed those based on OR images. In the test set, SR_SVM achieved an AUC of 0.878 (95% CI 0.8203–0.9358) and an accuracy of 0.854, while OR_SVM achieved an AUC of 0.822 (95% CI 0.7500–0.8937) and an accuracy of 0.665. SR_ResNet101 achieved an AUC of 0.799 (95% CI 0.7175–0.8806) and an accuracy of 0.793, and OR_ResNet101 an AUC of 0.751 (95% CI 0.6620–0.8401) and an accuracy of 0.713. Subsequently, Nomogram_A and Nomogram_B were constructed by integrating the SR_SVM model and the SR_ResNet101 model, respectively, with clinical parameters, while Nomogram_C was constructed solely from clinical indicators. In the test set, Nomogram_A demonstrated the best performance, with an AUC of 0.930 (95% CI 0.8913–0.9682) and an accuracy of 0.829. Nomogram_B had an AUC of 0.868 (95% CI 0.8102–0.9261) and an accuracy of 0.829, while Nomogram_C had an AUC of 0.880 (95% CI 0.8257–0.9349) and an accuracy of 0.787. The DeLong test revealed that the diagnostic performance of Nomogram_A based on SR_SVM was significantly higher than that of Nomogram_B, Nomogram_C, and the radiologist (P < 0.05).
The calibration curves and Hosmer–Lemeshow tests confirmed a high degree of fit, and the decision curve analysis demonstrated clinical value and potential patient benefit. Conclusions: The predictive model constructed using SR reconstructed ultrasound images demonstrated superior performance in predicting preoperative cervical lymph node metastasis in PTC compared to OR images. The nomogram prediction model based on SR images has the potential to enhance the accuracy of predictive models and aid in clinical decision-making. [ABSTRACT FROM AUTHOR]
- Published
- 2024
31. Modelling the temporal trajectories of human milk components.
- Author
-
Baranyi, József, Pacza, Tünde, Martins, Mayara L., Thakkar, Sagar K., and Samuel, Tinu M.
- Subjects
BREAST milk ,COMPOSITION of milk ,FOOD composition ,TEMPORAL databases ,COLOSTRUM - Abstract
Background: This paper demonstrates how available data can be explored and utilized to infer generic patterns in the temporal changes in Human Milk (HM) composition. Methods: The temporal trajectories of selected human milk components (HMC-s) were described, in the first four months postpartum, by a primary model consisting of two phases: a short linear phase in the colostrum, triggered by the parturition; and a longer second phase, where the concentration of the component converges to a steady state. The model was fitted to data available in a recently published database of temporal HMC trajectories both at the levels of individual molecules (such as specific fatty acid, oligosaccharide, and mineral molecules) and molecule-groups (such as total protein, total fat). Results: The properties of the trajectories suggest that experimental designs should follow non-equidistant sampling times, with increasingly longer time intervals after the first week postpartum. A selected parameter, the final stationary level, of the primary model was then studied as a function of geographical location (secondary modelling). Conclusions: We found that the total variation of the concentration of specific HMC-s is predominantly due to the inherent biological differences between individual mothers and, to a lesser extent, to the geographical location. [ABSTRACT FROM AUTHOR]
- Published
- 2024
32. A Machine Learning Approach to Identifying Risk Factors for Long COVID-19.
- Author
-
Machado, Rhea, Soorinarain Dodhy, Reshen, Sehgal, Atharve, Rattigan, Kate, Lalwani, Aparna, and Waynforth, David
- Subjects
- *
POST-acute COVID-19 syndrome , *COVID-19 , *DISEASE complications , *COVID-19 vaccines , *COVID-19 pandemic - Abstract
Long-term sequelae of coronavirus disease 2019 (COVID-19) infection are common and can have debilitating consequences. There is a need to understand risk factors for Long COVID-19 to give impetus to the development of targeted yet holistic clinical and public health interventions to reduce its associated healthcare and economic burden. Given the large number and variety of predictors implicated spanning health-related and sociodemographic factors, machine learning becomes a valuable tool. As such, this study aims to employ machine learning to produce an algorithm to predict Long COVID-19 risk, and thereby identify key predisposing factors. Longitudinal cohort data were sourced from the UK's "Understanding Society: COVID-19 Study" (n = 601 participants with past symptomatic COVID-19 infection confirmed by serology testing). The random forest classification algorithm demonstrated good overall performance with 97.4% sensitivity and modest specificity (65.4%). Significant risk factors included early timing of acute COVID-19 infection in the pandemic, greater number of hours worked per week, older age and financial insecurity. Loneliness and having uncommon health conditions were associated with lower risk. Sensitivity analysis suggested that COVID-19 vaccination is also associated with lower risk, and asthma with an increased risk. The results are discussed with emphasis on evaluating the value of machine learning; potential clinical utility; and some benefits and limitations of machine learning for health science researchers given its availability in commonly used statistical software. [ABSTRACT FROM AUTHOR]
- Published
- 2024
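For readers less familiar with the two metrics quoted in the abstract above, sensitivity and specificity can be computed from confusion-matrix counts as in the sketch below. The counts are hypothetical round numbers chosen for readability; they are not the study's data and do not reproduce its reported figures.

```python
def sensitivity(tp, fn):
    """True positive rate: share of actual positives that are flagged."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: share of actual negatives that are cleared."""
    return tn / (tn + fp)

# Hypothetical confusion-matrix counts, NOT taken from the study.
tp, fn, tn, fp = 90, 10, 60, 40

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
# Share of positive predictions that are false alarms.
false_alarm_share = fp / (tp + fp)
```

A classifier with high sensitivity and modest specificity, the pattern the abstract reports, misses few true cases but raises a larger number of false alarms.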
33. Pioneering bioinformatics with agent-based modelling: an innovative protocol to accurately forecast skin or respiratory allergic reactions to chemical sensitizers.
- Author
-
Russo, Giulia, Crispino, Elena, Casati, Silvia, Corsini, Emanuela, Worth, Andrew, and Pappalardo, Francesco
- Subjects
- *
ALLERGENS , *CHEMICAL testing , *CHEMICAL reactions , *ANIMAL experimentation , *COMPUTATIONAL biology - Abstract
The assessment of the allergenic potential of chemicals, crucial for ensuring public health safety, faces challenges in accuracy and raises ethical concerns due to reliance on animal testing. This paper presents a novel bioinformatic protocol designed to address the critical challenge of predicting immune responses to chemical sensitizers without the use of animal testing. The core innovation lies in the integration of advanced bioinformatics tools, including the Universal Immune System Simulator (UISS), which models detailed immune system dynamics. By leveraging data from structural predictions and docking simulations, our approach provides a more accurate and ethical method for chemical safety evaluations, especially in distinguishing between skin and respiratory sensitizers. Our approach integrates a comprehensive eight-step process, beginning with the meticulous collection of chemical and protein data from databases like PubChem and the Protein Data Bank. Following data acquisition, structural predictions are performed using cutting-edge tools such as AlphaFold to model proteins whose structures have not been previously elucidated. This structural information is then utilized in subsequent docking simulations, leveraging both ligand–protein and protein–protein interactions to predict how chemical compounds may trigger immune responses. The core novelty of our method lies in the application of UISS—an advanced agent-based modelling system that simulates detailed immune system dynamics. By inputting the results from earlier stages, including docking scores and potential epitope identifications, UISS meticulously forecasts the type and severity of immune responses, distinguishing between Th1-mediated skin and Th2-mediated respiratory allergic reactions. This ability to predict distinct immune pathways is a crucial advance over current methods, which often cannot differentiate between the sensitization mechanisms. 
To validate the accuracy and robustness of our approach, we applied the protocol to well-known sensitizers: 2,4-dinitrochlorobenzene for skin allergies and trimellitic anhydride for respiratory allergies. The results clearly demonstrate the protocol's ability to differentiate between these distinct immune responses, underscoring its potential for replacing traditional animal-based testing methods. The results not only support the potential of our method to replace animal testing in chemical safety assessments but also highlight its role in enhancing the understanding of chemical-induced immune reactions. Through this innovative integration of computational biology and immunological modelling, our protocol offers a transformative approach to toxicological evaluations, increasing the reliability of safety assessments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
34. Hidden in Plain Sight: A Data-Driven Approach to Safety Risk Management for Highway Traffic Officers.
- Author
-
Bortey, Loretta, Edwards, David J., Roberts, Chris, and Rille, Iain
- Subjects
ARTIFICIAL neural networks ,MACHINE learning ,SUPPORT vector machines ,INFORMATION superhighway ,ARTIFICIAL intelligence - Abstract
Highway traffic officers (HTOs) are often exposed to life-threatening workplace incidents while performing their duties. However, scant research has been undertaken to address these safety concerns. This research explores case study data from highway incident reports (held by National Highways, a UK government company) and employs a deep neural network (DNN) to unearth patterns that inform safety decision makers about pertinent safety challenges confronting HTOs. A mixed philosophical stance of positivism and interpretivism was adopted to synthesise the findings. A four-phase sequential method was implemented to evaluate the validity of the research, viz.: (i) architectural design; (ii) data exploration; (iii) predictive modelling; and (iv) performance evaluation. The DNN model's predictive performance was benchmarked against three other machine learning models, namely Support Vector Machines (SVM), Random Forest (RF), and Naïve Bayes (NB). The DNN model outperformed the other three models. Findings from the data exploration also show that most work operations undertaken by HTOs have a medium risk level, with night shifts posing the greatest risk challenges. Carriageways and traffic management enclosures had the highest incident occurrence. This is the first study to uncover such hidden patterns and predict risk levels using a database specifically for HTOs. This study presents evidence-based information for proactive risk management for HTOs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
35. Incorporating functional traits with habitat maps: patterns of diversity in coastal benthic assemblages.
- Author
-
Nemani, Shreya, Misiuk, Benjamin, Cote, David, Edinger, Evan, Mackin-McLaughlin, Julia, Templeton, Adam, and Robert, Katleen
- Subjects
NUMBERS of species ,RANDOM forest algorithms ,LIFE history theory ,TEXTURE mapping ,PREDICTION models - Abstract
Benthic species assemblages are groups of species that co-occur on the seafloor. Linking assemblages to physical environmental features allows for understanding and predicting their spatial distribution. Species identity and abundance are commonly quantified using a taxonomic approach to assess benthic diversity, yet functional traits that describe the behavior, life history, and morphology of a species may be equally or more important. Here, we investigate the biodiversity of five benthic species assemblages in relation to their habitat and environmental conditions in an Ecologically and Biologically Significant Area (EBSA) along Canada's east coast, using both a taxonomic approach and biological traits analysis. Random Forest regression was applied to map spatial patterns of functional and taxonomic diversity metrics, including richness, Shannon index, and Rao's quadratic entropy. We evaluate discrepancies between related taxonomic and trait measures, and the community-weighted mean of trait data was calculated to characterize each assemblage. Taxonomic and functional richness - representing the number of species and the species community volume in the trait space, respectively - showed similar spatial patterns. However, when considering diversity, which also accounts for the relative abundance and differences among species or traits, these patterns diverged. Taxonomically different assemblages exhibited similar trait compositions for two assemblages, indicating potential trait equivalencies, while one assemblage exhibited traits potentially indicating sensitivity to human activity. The taxonomic and functional metrics of richness and diversity were low close to the coast, which could be indicative of disturbance. Consideration of functional metrics can support spatial planning and prioritization for management and conservation efforts by assessing the sensitivity of traits to different stressors. [ABSTRACT FROM AUTHOR]
- Published
- 2024
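The diversity metrics named in the abstract above have standard definitions that can be sketched in a few lines. The code below computes species richness, the Shannon index, and Rao's quadratic entropy under the convention Q = Σᵢ Σⱼ dᵢⱼ pᵢ pⱼ (summing over all ordered pairs; some authors sum over i < j instead). The abundances and the dissimilarity matrix are invented for illustration, and `richness` here is the simple taxonomic count, not the trait-space volume the abstract uses for functional richness.

```python
import math

def richness(abundances):
    """Number of species with non-zero abundance."""
    return sum(1 for a in abundances if a > 0)

def shannon_index(abundances):
    """Shannon index H = -sum(p_i * ln p_i) over species present."""
    total = sum(abundances)
    return -sum((a / total) * math.log(a / total) for a in abundances if a > 0)

def rao_quadratic_entropy(abundances, dissimilarity):
    """Rao's Q: expected dissimilarity between two randomly drawn individuals.

    dissimilarity[i][j] is the (e.g. trait-based) distance between
    species i and j, with zeros on the diagonal.
    """
    total = sum(abundances)
    p = [a / total for a in abundances]
    n = len(p)
    return sum(dissimilarity[i][j] * p[i] * p[j]
               for i in range(n) for j in range(n))

# Hypothetical assemblage: four equally abundant species.
counts = [10, 10, 10, 10]
# With every pair maximally dissimilar (d = 1), Rao's Q reduces to the
# Gini-Simpson index, 1 - sum(p_i ** 2).
d = [[0 if i == j else 1 for j in range(4)] for i in range(4)]

h = shannon_index(counts)
q = rao_quadratic_entropy(counts, d)
```

Replacing `d` with trait-based distances makes Q a functional diversity measure, which is one way taxonomic and functional patterns can diverge, as the abstract describes.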
36. Importance of OCT-derived biomarkers for the recurrence of central serous chorioretinopathy using statistics and predictive modelling.
- Author
-
Seiler, Emilien, Delachaux, Léon, Cattaneo, Jennifer, Garjani, Ali, Martin, Thibaud, Duriez, Alexia, Baffou, Jérémy, Mousavi, Sepehr, Meloni, Ilenia, Bergin, Ciara, Tomasoni, Mattia, and Eandi, Chiara M.
- Subjects
CHOROID ,OPTICAL coherence tomography ,PATIENT experience ,INFERENTIAL statistics ,RETINAL diseases ,RETINAL ganglion cells ,RHODOPSIN - Abstract
Central serous chorioretinopathy (CSCR) is a retinal disease characterised by the accumulation of subretinal fluid, which often resolves spontaneously in acute cases. However, approximately one-third of patients experience recurrences that may cause severe and irreversible vision loss. This study aimed to identify parameters derived from optical coherence tomography (OCT) that are associated with CSCR recurrence. Our dataset included 5211 OCT scans from 344 eyes of 255 patients diagnosed with CSCR. 178 eyes were identified as recurrent, 109 as non-recurrent, and 57 were excluded. We extracted parameters using artificial intelligence algorithms based on U-Nets, convolutional kernels, and morphological operators. We applied inferential statistics to evaluate differences between the recurrent and non-recurrent groups, and we used a logistic regression predictive model, reporting the coefficients as a measure of biomarker importance. We identified nine predictive biomarkers for CSCR recurrence: age, intraretinal fluid, subretinal fluid, pigment epithelial detachments, choroidal vascularity index, integrity of photoreceptors and retinal pigment epithelium layer, choriocapillaris and choroidal stroma thickness, and thinning of the outer nuclear layer, and of the inner nuclear layer combined with the outer plexiform layer. These results could enable future developments in the automatic detection of CSCR recurrence, paving the way for translational medical applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
37. Leveraging Machine Learning for Optimized Mechanical Properties and 3D Printing of PLA/cHAP for Bone Implant.
- Author
-
Omigbodun, Francis T., Osa-Uwagboe, Norman, Udu, Amadi Gabriel, and Oladapo, Bankole I.
- Subjects
- *
MACHINE learning , *BIOMEDICAL engineering , *TISSUE engineering , *TISSUE scaffolds , *CANCELLOUS bone , *POLYLACTIC acid - Abstract
This study explores the fabrication and characterisation of 3D-printed polylactic acid (PLA) scaffolds reinforced with calcium hydroxyapatite (cHAP) for bone tissue engineering applications. By varying the cHAP content, we aimed to enhance PLA scaffolds' mechanical and thermal properties, making them suitable for load-bearing biomedical applications. The results indicate that increasing cHAP content improves the tensile and compressive strength of the scaffolds, although it also increases brittleness. Notably, incorporating cHAP at 7.5% and 10% significantly enhances thermal stability and mechanical performance, with properties comparable to or exceeding those of human cancellous bone. Furthermore, this study integrates machine learning techniques to predict the mechanical properties of these composites, employing algorithms such as XGBoost and AdaBoost. The models demonstrated high predictive accuracy, with R² scores of 0.9173 and 0.8772 for compressive and tensile strength, respectively. These findings highlight the potential of using data-driven approaches to optimise material properties autonomously, offering significant implications for developing custom-tailored scaffolds in bone tissue engineering and regenerative medicine. The study underscores the promise of PLA/cHAP composites as viable candidates for advanced biomedical applications, particularly in creating patient-specific implants with improved mechanical and thermal characteristics. [ABSTRACT FROM AUTHOR]
- Published
- 2024
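The coefficient of determination reported in the abstract above measures the share of variance in the measured values that a model's predictions explain. A minimal sketch of the standard formula, 1 − SS_res / SS_tot, follows; the strength values are made up for illustration and do not come from the study.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical measured vs. predicted strengths (arbitrary units).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]

score = r2_score(measured, predicted)
```

A score of 1.0 means perfect prediction, while 0.0 means the model does no better than always predicting the mean of the measured values.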
38. Hindcasting long‐term data unveils the influence of a changing climate on small mammal communities.
- Author
-
Lupone, Luke, Cooke, Raylene, Rendall, Anthony R., Siegrist, Angelina, Penton, Cara, Carlyon, Matt, Ouchtomsky, Tim, and White, John G.
- Subjects
- *
CLIMATE change adaptation , *NATIVE species , *MAMMAL communities , *PREDICTION models , *CLIMATE change - Abstract
Aim: Shifting climates are reshaping ecosystems globally and are projected to intensify over the coming century. Understanding how biodiversity will respond to these shifts is crucial for developing effective climate adaptation measures. We generate predictive models built from long‐term data to hindcast historic fluctuations in small mammal abundances as they have responded to shifting rainfall and fire conditions. This data set serves as the basis for predicting historical variations (hindcasting) in small mammal abundances, allowing us to examine their responses to decadal changes in fire and rainfall conditions within our study landscape. Location: Australia (Victoria). Taxa: Small mammals (Mammalia). Time Period: 1970–2022. Methods: Small mammal abundance was surveyed at 36 long‐term trapping sites and modelled against coinciding fire history, vegetation productivity and rainfall using generalized additive mixed models. Six species were then used in predictive modelling against these variables for the decades preceding our monitoring programme (1970–2007). Results: All species abundances increased with higher rainfall. Time since fire was also an important variable in all but one species model, with species displaying varying responses to time since fire. Hindcasting predictions for small mammal abundances varied with some species showing marked declines over time. Clear trends emerged, indicating more volatile population fluctuations in response to intensified fire and rainfall extremes in the 21st century. This suggests that periods of higher rainfall and less frequent fire events in the decades preceding our monitoring period supported higher and more stable small mammal abundances. Conclusions: Native species show distinct sensitivity to the combined effects of drought and fire, which has occurred in recent times. 
Intensification of these drivers has caused increased volatility in small mammal abundances with low abundance extremes occurring more frequently. [ABSTRACT FROM AUTHOR]
- Published
- 2024
39. Seismic site characterization baseline data for microzonation and site response analysis of Otuasega Town, Bayelsa State, Niger Delta region of Nigeria.
- Author
-
Abdullah, Gamil M. S., Kennedy, Charles, Kumar, Ashok, Salilew, Waleligne Molla, and Benjeddou, Omrane
- Subjects
SOIL sampling ,DRILL stem ,SHEAR waves ,SEDIMENTOLOGY ,EARTHQUAKES - Abstract
This study presents the findings of a comprehensive geotechnical and seismic site investigation conducted at Otuasega Town located in Bayelsa State within the Niger Delta region of Nigeria. Subsurface exploration involved advancing 10 boreholes to 30 m depth using hollow stem auger drilling. Continuous disturbed and undisturbed soil sampling was performed at 1.5 m intervals for detailed geotechnical testing. Laboratory tests on the recovered soil samples established the index properties, classification, densities and consistency limits of the stratified deposits. The subsurface profile comprised alternating layers of clay, silt and sand typical of deltaic sediments, with the clay fractions exhibiting medium to high plasticity. Shear wave velocity (Vs) profiling using Multichannel Analysis of Surface Waves (MASW) techniques categorised the site predominantly as Site Class C and D based on international standards. The Standard Penetration Test (SPT) N-values ranged from 5 to 10, indicating soft normally consolidated clay conditions typical of the Niger Delta region. Predictive empirical models developed from the field and lab data showed strong correlations for estimating key geotechnical parameters such as SPT blow count, Vs and liquefaction resistance. Ground response analyses using the Vs and SPT data indicated significant site amplification potential, with peak ground accelerations up to 1.5 times the bedrock motion. Liquefaction analysis based on the empirical SPT-based methods revealed a high potential for liquefaction in the sandy layers, especially under strong earthquake shaking. The study characterized the complex sedimentology and provided baseline information for seismic microzonation and site-specific ground response analyses to advance understanding of geohazards in this delta environment. [ABSTRACT FROM AUTHOR]
- Published
- 2024
40. Predicting Leukoplakia and Oral Squamous Cell Carcinoma Using Interpretable Machine Learning: A Retrospective Analysis.
- Author
-
Alam, Salem Shamsul, Ahmed, Saif, Farook, Taseef Hasan, and Dudley, James
- Subjects
MACHINE learning, ORAL leukoplakia, RANDOM forest algorithms, SQUAMOUS cell carcinoma, SUPPORT vector machines
- Abstract
Purpose: The purpose of this study is to assess the effectiveness of the best performing interpretable machine learning models in the diagnoses of leukoplakia and oral squamous cell carcinoma (OSCC). Methods: A total of 237 patient cases were analysed that included information about patient demographics, lesion characteristics, and lifestyle factors, such as age, gender, tobacco use, and lesion size. The dataset was preprocessed and normalised, and then separated into training and testing sets. The following models were tested: K-Nearest Neighbours (KNN), Logistic Regression, Naive Bayes, Support Vector Machine (SVM), and Random Forest. The overall accuracy, Kappa score, class-specific precision, recall, and F1 score were used to assess performance. SHAP (SHapley Additive ExPlanations) was used to interpret the Random Forest model and determine the contribution of each feature to the predictions. Results: The Random Forest model had the best overall accuracy (93%) and Kappa score (0.90). For OSCC, it had a precision of 0.91, a recall of 1.00, and an F1 score of 0.95. The model had a precision of 1.00, recall of 0.78, and F1 score of 0.88 for leukoplakia without dysplasia. The precision for leukoplakia with dysplasia was 0.91, the recall was 1.00, and the F1 score was 0.95. The top three features influencing the prediction of leukoplakia with dysplasia are buccal mucosa localisation, ages greater than 60 years, and larger lesions. For leukoplakia without dysplasia, the key features are gingival localisation, larger lesions, and tongue localisation. In the case of OSCC, gingival localisation, floor-of-mouth localisation, and buccal mucosa localisation are the most influential features. Conclusions: The Random Forest model outperformed the other machine learning models in diagnosing oral cancer and potentially malignant oral lesions with higher accuracy and interpretability. The machine learning models struggled to identify dysplastic changes. 
Using SHAP improves the understanding of the importance of features, facilitating early diagnosis and possibly reducing mortality rates. The model notably indicated that lesions on the floor of the mouth were highly unlikely to be dysplastic, instead showing one of the highest probabilities for being OSCC. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
41. Deep Learning for Predicting Hydrogen Solubility in n-Alkanes: Enhancing Sustainable Energy Systems.
- Author
-
Tatar, Afshin, Shokrollahi, Amin, Zeinijahromi, Abbas, and Haghighi, Manouchehr
- Abstract
As global population growth and urbanisation intensify energy demands, the quest for sustainable energy sources gains paramount importance. Hydrogen (H2) emerges as a versatile energy carrier, contributing to diverse processes in energy systems, industrial applications, and scientific research. To harness the H2 potential effectively, a profound grasp of its thermodynamic properties across varied conditions is essential. While field and laboratory measurements offer accuracy, they are resource-intensive. Experimentation involving high-pressure and high-temperature conditions poses risks, rendering precise H2 solubility determination crucial. This study evaluates the application of Deep Neural Networks (DNNs) for predicting H2 solubility in n-alkanes. Three DNNs are developed, focusing on model structure and overfitting mitigation. The investigation utilises a comprehensive dataset, employing distinct model structures. Our study successfully demonstrates that the incorporation of dropout layers and batch normalisation within DNNs significantly mitigates overfitting, resulting in robust and accurate predictions of H2 solubility in n-alkanes. The DNN models developed not only perform comparably to traditional ensemble methods but also offer greater stability across varying training conditions. These advancements are crucial for the safe and efficient design of H2-based systems, contributing directly to cleaner energy technologies. Understanding H2 solubility in hydrocarbons can enhance the efficiency of H2 storage and transportation, facilitating its integration into existing energy systems. This advancement supports the development of cleaner fuels and improves the overall sustainability of energy production, ultimately contributing to a reduction in reliance on fossil fuels and minimising the environmental impact of energy generation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. Development of a Decision Tree Classifier for Breast Cancer Diagnosis Using Fine Needle Aspirate Data
- Author
-
Agus Halid, I Gusti Ngurah Wikranta Arsa, Rezania Agramanisti Azdy, and Agus Aan Jiwa Permana
- Subjects
Breast Cancer, Cross-Validation, Decision Tree, Machine Learning, Medical Diagnostics, Predictive Modelling, Computer software, QA76.75-76.765
- Abstract
Breast cancer is one of the leading causes of mortality among women globally, necessitating early and accurate detection to improve survival rates. This study leverages machine learning to develop a decision tree classifier for distinguishing between benign and malignant breast masses using the Kaggle Breast Cancer FNA dataset. The dataset underwent rigorous pre-processing, including the removal of irrelevant columns, data cleaning, label encoding, and feature scaling. The model was evaluated using 5-fold cross-validation, achieving an average accuracy of 84.0%, with a test set accuracy of 83.72%. Performance metrics such as precision, recall, and F1-score further validated the model's robustness, with an overall accuracy of 90.24% on the test set. The decision tree classifier demonstrated high interpretability, making it a practical tool for aiding clinical decision-making. While the results are promising, the study highlights opportunities for improvement, including the use of ensemble methods and larger datasets to enhance generalizability. This research contributes to the growing body of evidence supporting machine learning applications in medical diagnostics, particularly in breast cancer detection.
- Published
- 2024
- Full Text
- View/download PDF
43. Predictive modelling of the UK physician associate supply: 2014–2038
- Author
-
Emyr Yosef Bakker, Peter Anthony Dixon, Tim Smith, and Jane Frances Rutt-Howard
- Subjects
Physician associates, Physician assistants, Medical associate professionals, Health workforce, Predictive modelling, Medicine
- Abstract
Introduction: The NHS Long Term Workforce Plan aims for 10,000 physician associates (PAs, formerly physician assistants) by 2036/7. This article uses three modelling approaches to project the UK PA supply from a baseline of 2014–2021 through to 2038 to forecast the profession's growth. Methods: The number of ‘clinically available PAs’ (cPAs; qualified PAs either working clinically or seeking clinical employment) was estimated using raw data from the 2014–2021 Faculty of Physician Associates censuses. This provided baseline data for all models (linear regression (LRM), exponential regression (ERM) and time-series forecast (TSFM)). Attrition, using data from other healthcare professions, was also modelled. Results: R² values together with authors’ judgement ruled the LRM more realistic than the ERM. The LRM projected up to 8,232 cPAs by 2038, although attrition reduced this significantly. The TSFM optimistically projected an upper limit (95% confidence interval) of 13,922 cPAs by 2038. Discussion: This article permits a wider view of potential PA numbers, with broad agreement between the LRM and the TSFM. It appears that future PA demand will be met, but factors such as attrition could impede this. Attrition itself may be mitigated through adequate resourcing, appropriate support mechanisms, and the development of a career structure. Professional regulation and legislation will further support PAs to work to their potential, subject to appropriate patient safety measures.
- Published
- 2024
- Full Text
- View/download PDF
44. FOX-TSA hybrid algorithm: Advancing for superior predictive accuracy in tourism-driven multi-layer perceptron models
- Author
-
Sirwan A. Aula and Tarik A. Rashid
- Subjects
Hybrid FOX-TSA, Multi-layer perceptron, Tourism industry, Nature-inspired algorithms, Predictive modelling, Information technology, T58.5-58.64, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Nature-inspired optimization models have received a great deal of interest due to the performance of these algorithms in solving resourceful and authentic problems. However, achieving high predictive accuracy in machine learning models for specialized domains, such as the tourism industry, remains challenging. Predictive modelling in tourism is vital for improving decision-making, including forecasting visitor behaviours and enhancing customer experiences. As the volume and complexity of tourism data increase, there is a need for optimization methods that enhance model training while effectively handling intricate datasets. This study proposes a hybrid FOX-TSA algorithm to optimize the MLP model. The hybrid algorithm synergises the Fox Optimization Algorithm's exploration capabilities with the Tree-Seed Algorithm's exploitation strengths. Using a tourism dataset with user preferences and ratings, the performance of the anticipated algorithm is compared with standalone FOX, TSA, PSO, and GWO algorithms. Results indicate that the hybrid FOX-TSA achieves superior predictive accuracy (94.64 %), faster convergence speed (reducing iterations by 25 %), and improved F1-score (94.63 %) on the test dataset. These findings underline the potential of the hybrid FOX-TSA algorithm to advance predictive modelling in the tourism sector and other domains requiring complex data handling.
- Published
- 2024
- Full Text
- View/download PDF
45. Exploring the risk factors and clustering patterns of periodontitis in patients with different subtypes of diabetes through machine learning and cluster analysis
- Author
-
Anna Zhao, Yuxiang Chen, Haoran Yang, Tingting Chen, Xianqi Rao, and Ziliang Li
- Subjects
Periodontitis, diabetes mellitus, consistent consensus, cluster A, machine learning, predictive modelling, Dentistry, RK1-715
- Abstract
Aim: To analyse the risk factors contributing to the prevalence of periodontitis among clusters of patients with diabetes and to examine the clustering patterns of clinical blood biochemical indicators. Materials and methods: Data regarding clinical blood biochemical indicators and periodontitis prevalence among 1804 patients with diabetes were sourced from the National Health and Nutrition Examination Survey (NHANES) database spanning 2009 to 2014. A clinical prediction model for periodontitis risk in patients with diabetes was constructed via the XGBoost machine learning method. Furthermore, the relationships between diabetes patient clusters and periodontitis prevalence were investigated through consistent consensus clustering analysis. Results: Seventeen clinical blood biochemical indicators emerged as superior predictors of periodontitis in patients with diabetes. Patients with diabetes were subsequently categorized into two subtypes: Cluster A presented a slightly lower periodontitis prevalence (74.80%), whereas Cluster B presented a higher prevalence risk (83.68%). Differences between the two groups were considered statistically significant at a p value of ≤0.05. There was marked variability in the associations of different cluster characteristics with periodontitis prevalence. Conclusions: Machine learning combined with consensus clustering analysis revealed a greater prevalence of periodontitis among patients with diabetes mellitus in Cluster B. This cluster was characterized by a smoking habit, a lower education level, a higher income-to-poverty ratio, and higher levels of albumin (ALB g/L) and alanine aminotransferase (ALT U/L).
- Published
- 2024
- Full Text
- View/download PDF
46. Predictive and mediation model for decision-making in the context of dynamic capabilities and knowledge management
- Author
-
Bocoya-Maline, José, Calvo-Mora, Arturo, and Rey Moreno, Manuel
- Published
- 2024
- Full Text
- View/download PDF
47. Risk factors and a prediction model of severe asparaginase-associated pancreatitis in children
- Author
-
Lin, Long, Yang, Kai-Hua, Chen, Chang-Cheng, Shen, Shu-Hong, Hu, Wen-Ting, and Deng, Zhao-Hui
- Published
- 2024
- Full Text
- View/download PDF
48. Exploratory risk prediction of type II diabetes with isolation forests and novel biomarkers
- Author
-
Hibba Yousef, Samuel F. Feng, and Herbert F. Jelinek
- Subjects
Diabetes, Inflammation, Oxidative stress, Mitochondrial dysfunction, Isolation forest, Predictive modelling, Medicine, Science
- Abstract
Type II diabetes mellitus (T2DM) is a rising global health burden due to its rapidly increasing prevalence worldwide, and can result in serious complications. Therefore, it is of utmost importance to identify individuals at risk as early as possible to avoid long-term T2DM complications. In this study, we developed an interpretable machine learning model leveraging baseline levels of biomarkers of oxidative stress (OS), inflammation, and mitochondrial dysfunction (MD) for identifying individuals at risk of developing T2DM. In particular, Isolation Forest (iForest) was applied as an anomaly detection algorithm to address class imbalance. iForest was trained on the control group data to detect cases of high risk for T2DM development as outliers. Two iForest models were trained and evaluated through ten-fold cross-validation, the first on traditional biomarkers (BMI, blood glucose levels (BGL) and triglycerides) alone and the second including the additional aforementioned biomarkers. The second model outperformed the first across all evaluation metrics, particularly for F1 score and recall, which were increased from 0.61 ± 0.05 to 0.81 ± 0.05 and 0.57 ± 0.06 to 0.81 ± 0.08, respectively. The feature importance scores identified a novel combination of biomarkers, including interleukin-10 (IL-10), 8-isoprostane, humanin (HN), and oxidized glutathione (GSSG), which were revealed to be more influential than the traditional biomarkers in the outcome prediction. These results reveal a promising method for simultaneously predicting and understanding the risk of T2DM development and suggest possible pharmacological intervention to address inflammation and OS early in disease progression.
- Published
- 2024
- Full Text
- View/download PDF
49. Deep learning-based prediction of plant height and crown area of vegetable crops using LiDAR point cloud
- Author
-
Reji J and Rama Rao Nidamanuri
- Subjects
Crop height, Crown area, Deep learning, LiDAR point cloud, Precision agriculture, Predictive modelling, Medicine, Science
- Abstract
Remote sensing has been increasingly used in precision agriculture. Buoyed by developments in the miniaturization of sensors and platforms, contemporary remote sensing offers data at resolutions fine enough to respond to within-farm variations. LiDAR point clouds offer features amenable to modelling structural parameters of crops. Early prediction of crop growth parameters helps farmers and other stakeholders dynamically manage farming activities. The objective of this work is the development and application of a deep learning framework to predict plant-level crop height and crown area at different growth stages for vegetable crops. LiDAR point clouds were acquired using a terrestrial laser scanner on five dates during the growth cycles of tomato, eggplant and cabbage on the experimental research farms of the University of Agricultural Sciences, Bengaluru, India. We implemented a hybrid deep learning framework combining distinct features of long short-term memory (LSTM) and Gated Recurrent Unit (GRU) for the predictions of plant height and crown area. The predictions were validated against reference ground truth measurements. The findings demonstrate that plant-level structural parameters can be predicted well ahead of crop growth stages with around 80% accuracy. Notably, the LSTM and the GRU models exhibited limitations in capturing variations in structural parameters. Conversely, the hybrid model offered significantly improved predictions, particularly for crown area, with error rates for height prediction ranging from 5 to 12%, with deviations exhibiting a more balanced distribution between overestimation and underestimation. However, the prediction quality is relatively low at the advanced growth stage, closer to the harvest. In contrast, the prediction quality is stable across the three different crops. The results indicate the presence of a robust relationship between the features of the LiDAR point cloud and the auto-feature map of the deep learning methods adapted for plant-level crop structural characterization. This approach effectively captured the inherent temporal growth pattern of the crops, highlighting the potential of deep learning for precision agriculture applications.
- Published
- 2024
- Full Text
- View/download PDF
50. Data on battery health and performance: Analysing Samsung INR21700-50E cells with advanced feature engineering
- Author
-
Sahar Qaadan, Aiman Alshare, Alexander Popp, Myrel Tiemann, Utz Spaeth, and Benedikt Schmuelling
- Subjects
Battery dataset, Feature engineering, State of health, Predictive modelling, Samsung INR21700-50E, Computer applications to medicine. Medical informatics, R858-859.7, Science (General), Q1-390
- Abstract
This dataset provides a comprehensive collection of detailed measurements from 256 Samsung INR21700-50E cells, spanning 32 batches. It uniquely combines raw data and engineered features derived from charge-discharge cycles and Hybrid Pulse Power Characterization tests. The engineered features—such as State of Health, internal resistance, capacity fade, and energy efficiency—offer critical insights into battery health and aging processes. These features are indispensable for predictive modelling, lifecycle management, and battery performance optimization, addressing key challenges in battery technology. This dataset is particularly valuable for advanced machine learning applications, enabling accurate battery state-of-health estimation and predictive maintenance. The engineered features, including cumulative cycles and dynamic resistance, further enhance the dataset's capacity to model battery behavior under diverse conditions. With batch-specific organisation and CSV format, this dataset facilitates seamless integration into a wide range of analyses, making it a vital resource for researchers and engineers focusing on battery degradation, energy storage systems, and developing robust predictive models for real-world applications.
- Published
- 2025
- Full Text
- View/download PDF