158 results on '"Ensemble algorithms"'
Search Results
2. COVID-19 from symptoms to prediction: A statistical and machine learning approach
- Author
-
Fakieh, Bahjat and Saleem, Farrukh
- Published
- 2024
3. Interpretable ensemble machine learning models for predicting the shear capacity of UHPC joints
- Author
-
Ye, Meng, Li, Lifeng, Jin, Weimeng, Tang, Jiahao, Yoo, Doo-Yeol, and Zhou, Cong
- Published
- 2024
4. Optimizing shared bike systems for economic gain: Integrating land use and retail
- Author
-
Bencekri, Madiha, Van Fan, Yee, Lee, Doyun, Choi, Minje, and Lee, Seungjae
- Published
- 2024
5. Clinical and Acquisition Data for Optimizing MGMT Methylation Status Prediction: A Comprehensive Ensemble Strategy Emphasizing Non-invasive Approaches
- Author
-
Miteva, Mariya, Nisheva-Pavlova, Maria, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Koprinkova-Hristova, Petia, editor, and Kasabov, Nikola, editor
- Published
- 2025
6. Optimizing landslide susceptibility mapping using integrated forest by penalizing attributes model with ensemble algorithms.
- Author
-
Chen, Wei, Wang, Chao, Zhao, Xia, Bai, Li, He, Qingfeng, Chen, Xi, Zhao, Qifei, Zhao, Ruixin, Li, Tao, Tsangaratos, Paraskevas, and Ilia, Ioanna
- Abstract
Landslides are a significant global natural hazard, threatening human settlements and the natural environment. The present study introduces a novel approach to landslide susceptibility assessment by integrating the Forest Attribute Penalty (FPA) model with three ensemble algorithms, namely AdaBoost (AB), Rotation Forest (RF), and Random Subspace (RS), and utilizing the Evidential Belief Function (EBF) to weight the classes of landslide-related factors. To evaluate the performance of the developed methodology, Yanchuan County, China, was chosen as the study area. Three hundred and eleven landslide areas were identified through remote sensing and field investigations and were randomly divided into 70% for model training and 30% for model evaluation. Sixteen landslide-related factors were considered, such as elevation, slope aspect, profile curvature, plan curvature, convergence index, slope length, terrain ruggedness index, topographic position index, distance to roads, distance to rivers, NDVI, land use, soil, rainfall, and lithology. EBF was employed to analyze the spatial correlation between these factors and landslide occurrences, providing the class weights of each factor for the implementation of FPA and the ensemble models. The next step involved the generation of the landslide susceptibility maps based on the models, with findings showing that more than half of the study area is classified as very low susceptibility. Model performance was assessed using receiver operating characteristic (ROC) curves and other statistical metrics, with the RFFPA model achieving the highest predictive ability, with AUC values of 0.878 and 0.890 for training and validation datasets, respectively. The AFPA and RSFPA hybrid models, however, demonstrated weaker predictive abilities compared to the FPA model.
The study highlights the importance of optimizing model performance and evaluating the suitability of ensemble approaches, emphasizing the role of topographical and environmental settings in influencing model accuracy. The use of EBF for weight calculation proved crucial in improving model outcomes, suggesting that this approach could be further refined and adapted to other regions with similar geomorphological settings for better land use planning and risk management. [ABSTRACT FROM AUTHOR]
- Published
- 2025
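Note on result 6: the abstract describes a random 70%/30% split of 311 landslide areas and an evaluation by area under the ROC curve (AUC). The sketch below illustrates those two steps in plain Python. It is not the authors' code: the function names `split_70_30` and `roc_auc`, the seed, and the rank-based AUC formulation are assumptions chosen for illustration.

```python
import random


def split_70_30(items, seed=0):
    """Randomly split items into a 70% training part and a 30% evaluation part."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 311 samples, as in the abstract; the sample identifiers are placeholders.
train, test = split_70_30(range(311))
print(len(train), len(test))  # 218 93
```

The pairwise formulation is quadratic in the number of samples, which is acceptable for a few hundred points; a sort-based version would be preferable for large datasets.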
7. A short-term photovoltaic output power forecasting based on ensemble algorithms using hyperparameter optimization.
- Author
-
Basaran, Kivanc, Çelikten, Azer, and Bulut, Hasan
- Subjects
*STANDARD deviations, *PHOTOVOLTAIC power systems, *SOLAR energy, *POWER resources, *ELECTRIC power distribution grids
- Abstract
The stochastic and intermittent nature of solar energy presents the power grid with the challenge of providing a stable, secure, and economical power supply, especially in the case of large-scale penetration. The prerequisite for addressing these challenges is accurate power output estimation from PV systems. In addition, accurate power estimation also ensures the correct sizing of PV systems for investors. In this study, the PV output prediction model has been developed based on ensemble algorithms using two years of real power and meteorological data from grid-connected PV systems. Grid search, random search, and Bayesian optimization were used to determine the optimal hyperparameters for ensemble algorithms. The originality of this study lies in (i) the use of hyperparameter optimization for ensemble algorithms in predicting PV performance, (ii) the degradation rate of PV panels by ensemble algorithms using the first two years' data, and (iii) the performance comparison of ensemble algorithms using the hyperparameter optimization technique. The accuracy and precision of the prediction model are determined by the relative root mean square error (RMSE), mean absolute error (MAE), mean bias error (MBE), mean scaled error (MSE), coefficient of determination (R2), mean absolute percentage error (MAPE), and maximum absolute error (MaxAE). To the best of our knowledge, this is one of the first studies to address the optimization of all hyperparameters to find the best parameters for ensemble algorithms and PV panel degradation rates. The results show that the CatBoost algorithm has better performance than the other algorithms used. The performance metrics of the CatBoost algorithm were determined to be 0.9327 R2, 0.047 MSE, 0.0388 MAE, 0.0003 MBE, 0.069 RMSE, 18.7 MAPE, and 0.79 MaxAE. [ABSTRACT FROM AUTHOR]
- Published
- 2024
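Note on result 7: the abstract lists several error metrics for forecast accuracy. The sketch below computes the textbook forms of RMSE, MAE, MBE, MAPE, MaxAE, and R2 in plain Python. It is an illustration only, not the authors' implementation: the function name `regression_metrics`, the sign convention for MBE (predicted minus actual), and the non-relative form of RMSE are assumptions, and the abstract's "mean scaled error (MSE)" is left out because its definition is not given there.

```python
import math


def regression_metrics(actual, predicted):
    """Return common forecast error metrics for two equal-length sequences.

    Assumes no actual value is zero (needed for MAPE) and that the actual
    values are not all identical (needed for R2).
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]  # predicted minus actual
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    return {
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MBE": sum(errors) / n,
        "MAPE": 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
        "MaxAE": max(abs(e) for e in errors),
        "R2": 1.0 - ss_res / ss_tot,
    }


# Toy values, not data from the study.
metrics = regression_metrics([2.0, 4.0], [3.0, 3.0])
print(metrics)
```

MBE can be near zero while MAE is large, because positive and negative errors cancel in the former; reporting both, as the abstract does, separates systematic bias from overall error size.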
8. Assessment of Ensemble-Based Machine Learning Algorithms for Exoplanet Identification.
- Author
-
Luz, Thiago S. F., Braga, Rodrigo A. S., and Ribeiro, Enio R.
- Subjects
STARS, RANDOM forest algorithms, ALGORITHMS, EXTRASOLAR planets, MATRICES (Mathematics), CLASSIFICATION
- Abstract
This paper presents a comprehensive assessment procedure for evaluating Ensemble-based Machine Learning algorithms in the context of exoplanet classification. The hyperparameter values of each algorithm were tuned. Deployments were carried out using the cross-validation method. Performance metrics, including accuracy, sensitivity, specificity, precision, and F1 score, were evaluated using confusion matrices generated from each implementation. Machine Learning (ML) algorithms were trained and used to identify exoplanet data. Most of the current research deals with traditional ML algorithms for this purpose. The Ensemble algorithm is another type of ML technique that combines the prediction performance of two or more algorithms to obtain an improved final prediction. Few studies have applied Ensemble algorithms to predict exoplanets. To the best of our knowledge, no paper that has exclusively assessed Ensemble algorithms exists, highlighting a significant gap in the literature about the potential of Ensemble methods. Five Ensemble algorithms were evaluated in this paper: Adaboost, Random Forest, Stacking, Random Subspace Method, and Extremely Randomized Trees. They achieved an average performance of more than 80% in all metrics. The results underscore the substantial benefits of fine tuning hyperparameters to enhance predictive performance. The Stacking algorithm achieved a higher performance than the other algorithms. This aspect is discussed in this paper. The results of this work show that it is worth increasing the use of Ensemble algorithms to improve exoplanet identification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
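Note on result 8: the abstract evaluates accuracy, sensitivity, specificity, precision, and F1 score from confusion matrices. For a binary confusion matrix these follow directly from the four cell counts, as the sketch below shows. The function name `binary_metrics` and the toy counts are assumptions for illustration; nothing here is taken from the paper's data.

```python
def binary_metrics(tp, fp, fn, tn):
    """Derive standard metrics from the four cells of a binary confusion matrix.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives, and true negatives. Assumes no denominator is zero.
    """
    sensitivity = tp / (tp + fn)   # recall: share of real positives found
    specificity = tn / (tn + fp)   # share of real negatives correctly rejected
    precision = tp / (tp + fp)     # share of positive predictions that are right
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Toy counts only.
print(binary_metrics(tp=8, fp=2, fn=2, tn=8))
```

Under cross-validation, one common approach is to compute these per fold and average them, or to sum the confusion matrices across folds first; the abstract does not say which was used.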
9. Research on Hard Rock Pillar Stability Prediction Based on SABO-LSSVM Model.
- Author
-
Xie, Xuebin and Zhang, Huaxi
- Subjects
SUPPORT vector machines, ROCK music, PREDICTION models, SAMPLING methods, MINES & mineral resources, REDUNDANCY in engineering
- Abstract
The increase in mining depth necessitates higher strength requirements for hard rock pillars, making mine pillar stability analysis crucial for pillar design and underground safety operations. To enhance the accuracy of predicting the stability state of mine pillars, a prediction model based on the subtraction-average-based optimizer (SABO) for hyperparameter optimization of the least-squares support vector machine (LSSVM) is proposed. First, by analyzing the redundancy of features in the mine pillar dataset and conducting feature selection, five parameter combinations were constructed to examine their effects on the performance of different models. Second, the SABO-LSSVM prediction model was compared vertically with classic models and horizontally with other optimized models to ensure comprehensive and objective evaluation. Finally, two data sampling methods and a combined sampling method were used to correct the bias of the optimized model for different categories of mine pillars. The results demonstrated that the SABO-LSSVM model exhibited good accuracy and comprehensive performance, thereby providing valuable insights for mine pillar stability prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2024
10. Product pricing solutions using hybrid machine learning algorithm.
- Author
-
Namburu, Anupama, Selvaraj, Prabha, and Varsha, M.
- Abstract
E-commerce platforms have been around for over two decades now, and their popularity among buyers and sellers alike has been increasing. With the COVID-19 pandemic, there has been a boom in online shopping, with many sellers moving their businesses towards e-commerce platforms. Product pricing is quite difficult at this increased scale of online shopping, considering the number of products being sold online. For instance, clothes show strong seasonal pricing trends, with brand names seeming to sway prices heavily. Electronics, on the other hand, have product specification-based pricing, which keeps fluctuating. This work aims to help business owners price their products competitively, drawing on similar products sold on e-commerce platforms and using reviews, statistical features, and categorical features. A hybrid algorithm, X-NGBoost, combining extreme gradient boost (XGBoost) with natural gradient boost (NGBoost), is proposed to predict the price. The proposed model is compared with ensemble models such as XGBoost, LightBoost and CatBoost. The proposed model outperforms the existing ensemble boosting algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
11. The study of indoor particulate matter in office buildings based on artificial intelligence
- Author
-
Soleimani-Alyar, S., Soleimani-Alyar, M., Yarahmadi, R., Beyk-Mohammadloo, P., and Fazeli, P.
- Published
- 2024
12. Efficient Ensemble Learning-Based Models for Plastic Hinge Length Prediction of Reinforced Concrete Shear Walls
- Author
-
Naser Safaeian Hamzehkolaei and Mohammad Sadegh Barkhordari
- Subjects
reinforced concrete wall (rcsw), plastic hinge length (phl), machine learning, ensemble algorithms, gradient boosting regressor (gbr), shapley additive explanations (shape), Technology
- Abstract
Reinforced concrete shear wall (RCSW) significantly improves the seismic performance of buildings. Accurate estimation of the plastic hinge length (PHL) of RCSWs is crucial as it significantly impacts the plastic deformation, ultimate displacement, and ductility capacity of RCSWs. This study aims to develop practical machine-learning (ML) models for PHL prediction of RCSWs. For this purpose, 721 data of nonplanar and rectangular RCSWs were utilized. Deep neural network-based ensemble learning models including Simple Averaging Ensemble (SAE), Stacking Ensemble (SE), Snapshot Ensemble (SSE), and Deep Forest (DP), were leveraged. Meanwhile, inherently ensemble-learning-based (IELB) algorithms including the XGBoost, RandomForest, CatBoost, HistGradientBoosting, AdaBoost, Bagging, ExtraTrees, and GradientBoosting regressor, and data-driven empirical equations were considered for comparison. The Taylor diagram and statistical comparison of the results revealed that the proposed SE model with the gradient boosting regressor (GBR) meta-learner (MAE=0.043, MSE=0.012, R2=0.916) outperformed all employed deep-and IELB-based ensemble algorithms as well as the empirical formulas for PHL of RCSWs. The SHapley Additive exPlanations-based model interpretation together with Sobol's sensitivity analysis results revealed that the wall length is the most crucial input variable, followed by the effective height and the axial load ratio.
- Published
- 2024
13. Online weighted Q-ensembles for reduced hyperparameter tuning in reinforcement learning.
- Author
-
Garcia, Renata and Caarls, Wouter
- Subjects
*DEEP reinforcement learning, *REINFORCEMENT learning, *MACHINE learning, *ROBOT control systems, *ONLINE education, *DEEP learning
- Abstract
Reinforcement learning is a promising paradigm for learning robot control, allowing complex control policies to be learned without requiring a dynamics model. However, even state of the art algorithms can be difficult to tune for optimum performance. We propose employing an ensemble of multiple reinforcement learning agents, each with a different set of hyperparameters, along with a mechanism for choosing the best performing set(s) on-line. In the literature, the ensemble technique is used to improve performance in general, but the current work specifically addresses decreasing the hyperparameter tuning effort. Furthermore, our approach targets on-line learning on a single robotic system, and does not require running multiple simulators in parallel. Although the idea is generic, the Deep Deterministic Policy Gradient was the model chosen, being a representative deep learning actor-critic method with good performance in continuous action settings but known high variance. We compare our online weighted q-ensemble approach to q-average ensemble strategies addressed in literature using alternate policy training, as well as online training, demonstrating the advantage of the new approach in eliminating hyperparameter tuning. The applicability to real-world systems was validated in common robotic benchmark environments: the bipedal robot half cheetah and the swimmer. Online weighted Q-Ensemble presented overall lower variance and superior results when compared with q-average ensembles using randomized parameterizations. [ABSTRACT FROM AUTHOR]
- Published
- 2024
14. A Nature-Inspired Concept Drift Adaptation Method for Industrial Data Stream Regression
- Author
-
Trat, Martin, Bergmann, Philipp, Ott, Andreas, Ovtcharova, Jivka, Chaari, Fakher, Series Editor, Gherardini, Francesco, Series Editor, Ivanov, Vitalii, Series Editor, Haddar, Mohamed, Series Editor, Cavas-Martínez, Francisco, Editorial Board Member, di Mare, Francesca, Editorial Board Member, Kwon, Young W., Editorial Board Member, Tolio, Tullio A. M., Editorial Board Member, Trojanowska, Justyna, Editorial Board Member, Schmitt, Robert, Editorial Board Member, Xu, Jinyang, Editorial Board Member, Wang, Yi-Chi, editor, Chan, Siu Hang, editor, and Wang, Zih-Huei, editor
- Published
- 2024
15. Supervised Machine Learning Algorithms for the Analysis of Ship Engine Data
- Author
-
Dimitriou, Theodoros, Skondras, Emmanouil, Hitiris, Christos, Gkola, Cleopatra, Papapanagiotou, Ioannis S., Vergados, Dimitrios J., Papapanagiotou, Stavros I., Koumantakis, Stratos, Michalas, Angelos, Vergados, Dimitrios D., Akan, Ozgur, Editorial Board Member, Bellavista, Paolo, Editorial Board Member, Cao, Jiannong, Editorial Board Member, Coulson, Geoffrey, Editorial Board Member, Dressler, Falko, Editorial Board Member, Ferrari, Domenico, Editorial Board Member, Gerla, Mario, Editorial Board Member, Kobayashi, Hisashi, Editorial Board Member, Palazzo, Sergio, Editorial Board Member, Sahni, Sartaj, Editorial Board Member, Shen, Xuemin, Editorial Board Member, Stan, Mircea, Editorial Board Member, Jia, Xiaohua, Editorial Board Member, Zomaya, Albert Y., Editorial Board Member, Maglaras, Leandros A., editor, and Douligeris, Christos, editor
- Published
- 2024
16. A More Effective Ensemble ML Method for Detecting Breast Cancer
- Author
-
Ferdous, Most. Jannatul, Bansal, Jagdish Chand, Series Editor, Deep, Kusum, Series Editor, Nagar, Atulya K., Series Editor, Asirvatham, David, editor, Gonzalez-Longatt, Francisco M., editor, Falkowski-Gilski, Przemyslaw, editor, and Kanthavel, R., editor
- Published
- 2024
17. Application of Text Analysis and Ensemble Algorithms in Forecasting Companies Bankruptcy
- Author
-
Drogovoz, Pavel A., Nevredinov, Alexandr R., Pisello, Anna Laura, Editorial Board Member, Hawkes, Dean, Editorial Board Member, Bougdah, Hocine, Editorial Board Member, Rosso, Federica, Editorial Board Member, Abdalla, Hassan, Editorial Board Member, Boemi, Sofia-Natalia, Editorial Board Member, Mohareb, Nabil, Editorial Board Member, Mesbah Elkaffas, Saleh, Editorial Board Member, Bozonnet, Emmanuel, Editorial Board Member, Pignatta, Gloria, Editorial Board Member, Mahgoub, Yasser, Editorial Board Member, De Bonis, Luciano, Editorial Board Member, Kostopoulou, Stella, Editorial Board Member, Pradhan, Biswajeet, Editorial Board Member, Abdul Mannan, Md., Editorial Board Member, Alalouch, Chaham, Editorial Board Member, Gawad, Iman O., Editorial Board Member, Nayyar, Anand, Editorial Board Member, Amer, Mourad, Series Editor, Sergi, Bruno S., editor, Popkova, Elena G., editor, Ostrovskaya, Anna A., editor, Chursin, Alexander A., editor, and Ragulina, Yulia V., editor
- Published
- 2024
18. A Hybrid Ensemble Machine Learning Approach (EHML) for DDOS Attack Detection in Smart City Network Traffic
- Author
-
Mante (Khurpade), Jyoti, Patil, Prerna, Dhotay, Megha, Budhavale, Shilpa, Kulkarni, Anand J., editor, and Cheikhrouhou, Naoufel, editor
- Published
- 2024
19. Exploring the interrelationships between composition, rheology, and compressive strength of self-compacting concrete: An exploration of explainable boosting algorithms
- Author
-
Sarmed Wahab, Babatunde Abiodun Salami, Ali H. AlAteah, Mohammed M.H. Al-Tholaia, and Turki S. Alahmari
- Subjects
Self-compacting concrete, Compressive strength, Machine learning, Ensemble algorithms, Explainable boosting machine, Materials of engineering and construction. Mechanics of materials, TA401-492
- Abstract
This study introduces a novel methodology for enhancing the compressive strength of self-compacting concrete (SCC) via the use of the Explainable Boosting Machine (EBM), a sophisticated and interpretable machine learning algorithm. It presents a data-driven model that aims to accurately predict the strength of SCC by considering the intricate interactions among its various elements. Additionally, the model provides insights into the variables that influence SCC's compressive strength. By using EBM in conjunction with XGBoost and CatBoost algorithms, this study conducts a comparative examination of predictive abilities using datasets related to composition and rheology. The findings reveal that CatBoost has greater predictive performance using rheology dataset, as shown by an R2 value of 0.977. Conversely, XGBoost exhibits a higher predictive capability using the composition dataset, as indicated by an R2 value of 0.947. The EBM can provide comprehensive explanations at both global and local levels. It effectively identifies the key factors that have a significant influence on compressive strength. These factors include the coarse aggregate content, cement content, water content, viscosity, and V-funnel flow time. The study findings provide more evidence to support the notion that including rheological data into the model leads to a notable improvement in its accuracy. This outcome further confirms the existence of a direct correlation between rheological properties and compressive strength. The explanatory insights provided by EBM give practical instructions for customising SCC mixes to attain desired strengths. This facilitates quality control and enables personalised concrete design in the field of construction. This study highlights the potential of interpretable machine learning algorithms in improving the predictive modelling of SCC features. This advancement may lead to the development of more durable, efficient, and customised building materials.
- Published
- 2024
20. Research on Hard Rock Pillar Stability Prediction Based on SABO-LSSVM Model
- Author
-
Xuebin Xie and Huaxi Zhang
- Subjects
pillar stability prediction, SABO-LSSVM, ensemble algorithms, oversampling algorithm, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
- Abstract
The increase in mining depth necessitates higher strength requirements for hard rock pillars, making mine pillar stability analysis crucial for pillar design and underground safety operations. To enhance the accuracy of predicting the stability state of mine pillars, a prediction model based on the subtraction-average-based optimizer (SABO) for hyperparameter optimization of the least-squares support vector machine (LSSVM) is proposed. First, by analyzing the redundancy of features in the mine pillar dataset and conducting feature selection, five parameter combinations were constructed to examine their effects on the performance of different models. Second, the SABO-LSSVM prediction model was compared vertically with classic models and horizontally with other optimized models to ensure comprehensive and objective evaluation. Finally, two data sampling methods and a combined sampling method were used to correct the bias of the optimized model for different categories of mine pillars. The results demonstrated that the SABO-LSSVM model exhibited good accuracy and comprehensive performance, thereby providing valuable insights for mine pillar stability prediction.
- Published
- 2024
21. Classification of water subscribers by machine learning algorithms.
- Author
-
Dahesh, Arezoo, Tavakkoli‐Moghaddam, Reza, Tajally, AmirReza, Erfani‐Jazi, Aseman, and Babazadeh‐Behestani, Milad
- Subjects
MACHINE learning, RESIDENTIAL water consumption, RANDOM forest algorithms, WATER consumption, WATER shortages, SUPPORT vector machines
- Abstract
The problem of water scarcity and water crisis (e.g., stable water resources, reduced rainfall, increased urban population growth and lack of proper management of water consumption in urban and domestic water) has recently become a significant issue. Therefore, examining the behaviour of Tehran Province Water and Wastewater (TPWW) subscribers to identify high‐consumption subscribers and explain methods to encourage and educate them more about the correct water consumption pattern can help deal with this crisis. This study aims to use machine learning algorithms to build a robust model for the classification of subscribers in Tehran. Then, new subscribers can be classified into similar classes. For this reason, ensemble algorithms, support vector machines and neural networks are used to predict subscribers' behaviour. Then, the random forest algorithm from the set of ensemble algorithms is considered the best model for the TPWW case with 99% and 98% in train and test accuracy, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
22. Machine Learning For Groundwater Quality Classification: A Step Towards Economic and Sustainable Groundwater Quality Assessment Process.
- Author
-
Zegaar, Aymen, Ounoki, Samira, and Telli, Abdelmoutia
- Subjects
GROUNDWATER quality, MACHINE learning, IRRIGATION water quality, MULTILAYER perceptrons, WATER quality
- Abstract
Evaluation of water quality is essential for protecting both the environment and human wellbeing. There is a paucity of research on using machine learning for classification of groundwater used for irrigation with fewer input parameters and still getting satisfactory results, despite earlier studies exploring its application in evaluating water quality. Studies are required to determine the feasibility of using machine learning to classify groundwater used for irrigation using minimal input parameters. In this study, we developed machine learning models to simulate the Irrigation Water Quality Index (IWQI) and an economic model that used an optimal number of inputs with the highest possible accuracy. We utilized eight classification algorithms, including the LightGBM classifier, CatBoost, Extra Trees, Random Forest, Gradient Boosting classifiers, Support Vector Machines, Multi-Layer Perceptrons, and K-Nearest Neighbors Algorithm. Two scenarios were considered, the first using six inputs, including conductivity, chloride (Cl⁻), bicarbonate (HCO₃⁻), sodium (Na⁺), calcium (Ca²⁺), and magnesium (Mg²⁺), and the second using three parameters, including total hardness (TH), chloride (Cl⁻), and sulfate (SO₄²⁻) that were selected based on the Mutual Information (MI) result. The models achieved satisfactory performance, with the LightGBM classifier as the best model, yielding a 91.08% F1 score using six inputs, and the Extra Trees classifier as the best model, yielding an 86.30% F1 score using three parameters. Our findings provide a valuable contribution to the development of accurate and efficient machine learning models for water quality evaluation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
23. A Novel Computer-Aided Diagnostic System for Alzheimer's Diagnosis Using Variational Mode Decomposition Method.
- Author
-
Aslan, Zülfikar
- Subjects
*ALZHEIMER'S disease, *COMPUTER-aided diagnosis, *DECOMPOSITION method, *PRINCIPAL components analysis, *EARLY diagnosis
- Abstract
Alzheimer's disease (AD) is a significant neurological disorder with deficits in cognitive and behavioral brain functions. Although there is no cure for AD, early diagnosis is essential in slowing the disease and increasing the patient's quality of life. In addition, the diagnosis of the disease includes costly tests and a complex process that an experienced specialist must evaluate. Therefore, this study presents a new computer-aided diagnosis system (CAD) allowing automatic AD diagnosis by EEG signals. The present study used EEG recordings of 24 healthy controls and 24 AD patients. The proposed algorithm includes a preprocessing step using multi-scale principal component analysis (MSPCA) for noise removal, decomposition of the signal into subcomponents with the variational mode decomposition (VMD) method, and extraction of statistical features from each subcomponent. The achievement of the recommended method in distinguishing between healthy individuals and AD patients was tested by applying various ensemble learning techniques and decomposition methods. As a result of the empirical studies, the maximum classification accuracy of AD diagnosis was obtained as 98.42 ± 0.06 using the Rotation Forest algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2024
24. Ligand Based Virtual Screening of Molecular Compounds in Drug Discovery Using GCAN Fingerprint and Ensemble Machine Learning Algorithm.
- Author
-
Ani, R., Deepa, O. S., and Manju, B. R.
- Subjects
DRUG discovery, MACHINE learning, CHEMICALS, SUPPORT vector machines, RANDOM forest algorithms
- Abstract
The drug development process takes a long time since it requires sorting through a large number of inactive compounds from a large collection of compounds chosen for study and choosing just the most pertinent compounds that can bind to a disease protein. The use of virtual screening in pharmaceutical research is growing in popularity. During the early phases of medication research and development, it is crucial. Chemical compound searches are now more narrowly targeted. Because the databases contain more and more ligands, this method needs to be quick and exact. Neural network fingerprints were created more effectively than the well-known Extended Connectivity Fingerprint (ECFP). Only the largest sub-graph is taken into consideration to learn the representation, despite the fact that the conventional graph network generates a better-encoded fingerprint. When using the average or maximum pooling layer, it also contains unrelated data. This article suggested the Graph Convolutional Attention Network (GCAN), a graph neural network with an attention mechanism, to address these problems. Additionally, it makes the nodes or sub-graphs that are used to create the molecular fingerprint more significant. The generated fingerprint is used to classify drugs using ensemble learning. As base classifiers, ensemble stacking is applied to Support Vector Machines (SVM), Random Forest, Naive Bayes, Decision Trees, AdaBoost, and Gradient Boosting. When compared to existing models, the proposed GCAN fingerprint with an ensemble model achieves relatively high accuracy, sensitivity, specificity, and area under the curve. Additionally, it is revealed that our ensemble learning with generated molecular fingerprint yields 91% accuracy, outperforming earlier approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2023
25. Review on Fetal Health Classification
- Author
-
Nagabotu, Vimala, Namburu, Anupama, Bansal, Jagdish Chand, Series Editor, Deep, Kusum, Series Editor, Nagar, Atulya K., Series Editor, and Uddin, Mohammad Shorif, editor
- Published
- 2023
26. Instructor-assisted question classification system using machine learning algorithms with N-gram and weighting schemes
- Author
-
Delali Kwasi Dake, Edward Nwiah, Griffith Selorm Klogo, and Wisdom Xornam Ativi
- Subjects
Supervised algorithms, Ensemble algorithms, Natural language processing, Text classification, Computational linguistics. Natural language processing, P98-98.5, Electronic computers. Computer science, QA75.5-76.95
- Abstract
One aspect of natural language processing, text classification, has become necessary in the educational domain due to the increasing number of students and the COVID-19 outbreak. The advent of the devastating pandemic and the need to remain safe have surged the discussions around online learning and integrated modules in teaching and learning. In this study, we employed machine learning to develop an automatic instructor-assisted question classification module for learning management systems. In selecting the best classifier, the conventional and the ensemble machine learning algorithms were compared using the tenfold and the fivefold cross-validation techniques. In addition, the N-gram feature selection mechanism and three weighting schemes were evaluated for performance enhancement. The detailed analysis indicates that the ensemble algorithms outperform the conventional ones with decreasing accuracy as the N-gram size increases. For all compared algorithms, the AdaBoost (SVM) ensemble algorithm has the highest accuracy of 78.55% for Unigram (TP, TF, TF-IDF). In addition, the AdaBoost (SVM) emerged with the highest F1-score of 0.782, while the ensemble Bagging (RF) algorithm had the highest ROC value of 0.955 for Unigram (TP).
- Published
- 2023
- Full Text
- View/download PDF
27. Novel Deep Hybrid and Ensemble Algorithms for Improving GPS Navigation Positioning Accuracy
- Author
-
Tolga Aydin and Ebru Erdem
- Subjects
Ensemble algorithms, GPS, GPSCNNs, GPSLSTM, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
GPS (Global Positioning System) is a widespread system used for various purposes in today’s world, and it is essential to suggest innovative, effective solutions to improve its use and functions. The present study proposes GPS coordinate conversion models based on Machine Learning (ML) and Deep Learning (DL) algorithms in order to “improve accuracy of GPS conversion and positioning services”. To achieve this purpose, 23 different models are tested on two different data sets. The study primarily aims to improve the positioning accuracy of navigation systems by using GPS data through hybrid and ensemble algorithms. The proposed DL-based models are named GPSCNNs and GPSLSTM. GPSCNNs contain the “Xception, VGG16, VGG19, Alexnet, CNN1, CNN2, CNN3” deep learning algorithms in their structure. Of these algorithms, “Xception, VGG16, VGG19, Alexnet” are pre-trained models. “CNN1” consists of 2 Convolution, 2 Average Pool, 1 Flatten, and 5 Dense layers. “CNN2” consists of 1 Convolution, 1 Max Pool, 1 Flatten, and 4 Dense layers. “CNN3” consists of 4 Convolution, 4 Batch Normalization, 2 Max Pool, 1 Flatten, and 3 Dense layers. GPSLSTM contains 1 LSTM and 1 Dense layer in its structure. Raw GPS data are fed into the models as input, followed by obtaining information about the features of the data and getting coordinate data as input. The results show that ensemble models provide the most accurate positioning and that GPSCNNs and GPSLSTM are quite promising in boosting this accuracy.
- Published
- 2023
- Full Text
- View/download PDF
28. Instructor-assisted question classification system using machine learning algorithms with N-gram and weighting schemes.
- Author
-
Dake, Delali Kwasi, Nwiah, Edward, Klogo, Griffith Selorm, and Ativi, Wisdom Xornam
- Subjects
MACHINE learning, NATURAL language processing, COVID-19 pandemic, LEARNING modules, ONLINE education
- Abstract
One aspect of natural language processing, text classification, has become necessary in the educational domain due to the increasing number of students and the COVID-19 outbreak. The advent of the devastating pandemic and the need to remain safe have intensified discussions around online learning and integrated modules in teaching and learning. In this study, we employed machine learning to develop an automatic instructor-assisted question classification module for learning management systems. In selecting the best classifier, conventional and ensemble machine learning algorithms were compared using tenfold and fivefold cross-validation. In addition, the N-gram feature selection mechanism and three weighting schemes were evaluated for performance enhancement. The detailed analysis indicates that the ensemble algorithms outperform the conventional ones, with accuracy decreasing as the N-gram size increases. Of all compared algorithms, the AdaBoost (SVM) ensemble algorithm has the highest accuracy, 78.55% for Unigram (TP, TF, TF-IDF). In addition, AdaBoost (SVM) achieved the highest F1-score of 0.782, while the ensemble Bagging (RF) algorithm had the highest ROC value of 0.955 for Unigram (TP). [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
29. Response Spectrum Analysis of Multi-Story Shear Buildings Using Machine Learning Techniques.
- Author
-
Georgioudakis, Manolis and Plevris, Vagelis
- Subjects
SPECTRUM analysis, MACHINE learning, STRUCTURAL engineering, CIVIL engineering, STRUCTURAL engineers, MODAL analysis
- Abstract
The dynamic analysis of structures is a computationally intensive procedure that must be considered, in order to make accurate seismic performance assessments in civil and structural engineering applications. To avoid these computationally demanding tasks, simplified methods are often used by engineers in practice, to estimate the behavior of complex structures under dynamic loading. This paper presents an assessment of several machine learning (ML) algorithms, with different characteristics, that aim to predict the dynamic analysis response of multi-story buildings. Large datasets of dynamic response analyses results were generated through standard sampling methods and conventional response spectrum modal analysis procedures. In an effort to obtain the best algorithm performance, an extensive hyper-parameter search was elaborated, followed by the corresponding feature importance. The ML model which exhibited the best performance was deployed in a web application, with the aim of providing predictions of the dynamic responses of multi-story buildings, according to their characteristics. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
30. Implementing XGBoost Machine Learning Ensemble Algorithm to Predict Contact Pressure of Two 3D Bodies
- Author
-
Orlov, Stepan, Aubekerov, Kairzhan, Koptsev, Stanislav, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Jahn, Carlos, editor, Ungvári, László, editor, and Ilin, Igor, editor
- Published
- 2022
- Full Text
- View/download PDF
31. EIDIMA: Edge-based Intrusion Detection of IoT Malware Attacks using Decision Tree-based Boosting Algorithms
- Author
-
Santhadevi, D., Janet, B., Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Satyanarayana, Ch., editor, Samanta, Debasis, editor, Gao, Xiao-Zhi, editor, and Kapoor, Rajiv Kumar, editor
- Published
- 2022
- Full Text
- View/download PDF
32. Exploring the Performance of Ensemble Machine Learning Classifiers for Sentiment Analysis of COVID-19 Tweets
- Author
-
Rahman, Md. Mahbubar, Islam, Muhammad Nazrul, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Shakya, Subarna, editor, Balas, Valentina Emilia, editor, Kamolphiwong, Sinchai, editor, and Du, Ke-Lin, editor
- Published
- 2022
- Full Text
- View/download PDF
33. Applying Prediction Models Based on Ensemble Machine Learning Algorithms to Estimate Resource Requirements at Healthcare Centers.
- Author
-
Hernández, Carlos and Leal, Paola
- Subjects
MEDICAL centers, MACHINE learning, ALGORITHMS, ARTIFICIAL neural networks, PUBLIC hospitals, PREDICTION models
- Abstract
Whether a high-complexity hospital or a smaller clinic, healthcare centers have to withstand the constant pressure of the incoming flow of new patients. While some patients require a simple medical procedure, others will need further examination and will probably have to remain under observation for some time. This situation is particularly complicated in times of sanitary crisis. Since infrastructure, supplies, and medical staff are limited resources, there is a real need to use them efficiently. This research focuses on the use of ensemble machine learning algorithms to develop models for predicting the destination of patients who are discharged after a stay at an intensive care unit (ICU). The investigation was carried out following a 4-phase methodology: analysis, design, development, and validation. During the analysis, an extensive review and preprocessing of patient records collected from a public hospital was carried out. Then, during the design, several ensemble machine learning algorithms were compared and selected for the investigation, to name a few: Linear Regression, Decision Tree, Stacking, Bagging, and Random Forest. The following phases, development and validation, were completed using data processing software. In all models proposed here, instead of a simple hold-out, a 10-fold cross-validation scheme was applied. For the purposes of this research, twenty thousand patient records collected in 2020 were considered. The complete dataset was split into two subsets: one for training and test with 80% of the data, and another for validation with the remaining 20%. During the development of the models, only the training and test data were used. The validation data were used only to measure the models' performance on unseen data. Results revealed that, regardless of the size of the training and test dataset, there was a notable consistency in the correct prediction rates. The proposed ensemble scheme, made of three base learners plus a meta algorithm, systematically led to correct prediction rates close to 82%. In conclusion, the proposed models showed that, based on the existing data, high rates of correct prediction can be achieved when an ensemble scheme is used. In this case, with reasonable certainty, it was possible to predict whether a patient was going to be referred to another unit or sent home after his or her stay at the ICU. [ABSTRACT FROM AUTHOR]
- Published
- 2022
34. INCORPORATING DENSITY IN K-NEAREST NEIGHBORS REGRESSION.
- Author
-
Mahfouz, Mohamed A.
- Subjects
REGRESSION analysis, DENSITY, K-nearest neighbor classification
- Abstract
The application of the traditional k-nearest neighbours in regression analysis suffers from several difficulties when only a limited number of samples are available. In this paper, two decision models based on density are proposed. In order to reduce testing time, a k-nearest neighbours table (kNN-Table) is maintained to keep the neighbours of each object x along with their weighted Manhattan distance to x and a binary vector representing the increase or the decrease in each dimension compared to x's values. In the first decision model, if the unseen sample has a distance to one of its neighbours x that is less than that of the farthest neighbour of x's neighbour, then its label is estimated using linear interpolation; otherwise, linear extrapolation is used. In the second decision model, for each neighbour x of the unseen sample, the distance of the unseen sample to x and the binary vector are computed. Also, the set S of nearest neighbours of x is identified from the kNN-Table. For each sample in S, a normalized distance to the unseen sample is computed using the information stored in the kNN-Table, and it is used to compute the weight of each neighbour of the neighbours of the unseen object. In the two models, a weighted average of the computed label for each neighbour is assigned to the unseen object. The diversity between the two proposed decision models and the traditional kNN regressor motivates us to develop an ensemble of the two proposed models along with the traditional kNN regressor. The ensemble is evaluated, and the results show that the ensemble achieves a significant increase in performance compared to its base regressors and several related algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
35. Remote Interference Discrimination Testbed Employing AI Ensemble Algorithms for 6G TDD Networks.
- Author
-
Zhang, Hanzhong, Zhou, Ting, Xu, Tianheng, and Hu, Honglin
- Subjects
*ALGORITHMS, *ARTIFICIAL intelligence, *MACHINE-to-machine communications, *INTERNET of things
- Abstract
The Internet-of-Things (IoT) massive access is a significant scenario for sixth-generation (6G) communications. However, low-power IoT devices easily suffer from remote interference caused by the atmospheric duct under the 6G time-division duplex (TDD) mode. It causes distant downlink wireless signals to propagate beyond the designed protection distance and interfere with local uplink signals, leading to a large outage probability. In this paper, a remote interference discrimination testbed is proposed to detect interference, which supports the comparison of different types of algorithms on the testbed. Specifically, 5,520,000 TDD network-side data collected by real sensors are used to validate the interference discrimination capabilities of nine promising AI algorithms. Moreover, a consistent comparison on the testbed shows that the ensemble algorithm achieves an average accuracy 12% higher than the single-model algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
36. A Comparative Study on Breast Cancer Tissues Using Conventional and Modern Machine Learning Models
- Author
-
Lakshmi, D., Gurrela, Srinivas Reddy, Kuncharam, Manideep, Howlett, Robert J., Series Editor, Jain, Lakhmi C., Series Editor, Satapathy, Suresh Chandra, editor, Bhateja, Vikrant, editor, Favorskaya, Margarita N., editor, and Adilakshmi, T., editor
- Published
- 2021
- Full Text
- View/download PDF
37. BoostedEnML: Efficient Technique for Detecting Cyberattacks in IoT Systems Using Boosted Ensemble Machine Learning.
- Author
-
Okey, Ogobuchi Daniel, Maidin, Siti Sarah, Adasme, Pablo, Lopes Rosa, Renata, Saadi, Muhammad, Carrillo Melgarejo, Dick, and Zegarra Rodríguez, Demóstenes
- Subjects
*BOTNETS, *CYBERTERRORISM, *MACHINE learning, *INTERNET of things, *RANDOM forest algorithms, *PLURALITY voting
- Abstract
Following the recent advances in wireless communication leading to increased Internet of Things (IoT) systems, many security threats are currently ravaging IoT systems, causing harm to information. Considering the vast application areas of IoT systems, ensuring that cyberattacks are holistically detected to avoid harm is paramount. Machine learning (ML) algorithms have demonstrated high capacity in helping to mitigate attacks on IoT devices and other edge systems with reasonable accuracy. However, the dynamics of operation of intruders in IoT networks require more improved IDS models capable of detecting multiple attacks with a higher detection rate and lower computational resource requirement, which is one of the challenges of IoT systems. Many ensemble methods have been used with different ML classifiers, including decision trees and random forests, to propose IDS models for IoT environments. The boosting method is one of the approaches used to design an ensemble classifier. This paper proposes an efficient method for detecting cyberattacks and network intrusions based on boosted ML classifiers. Our proposed model is named BoostedEnML. First, we train six different ML classifiers (DT, RF, ET, LGBM, AD, and XGB) and obtain an ensemble using the stacking method and another with a majority voting approach. Two different datasets containing high-profile attacks, including distributed denial of service (DDoS), denial of service (DoS), botnets, infiltration, web attacks, heartbleed, portscan, and botnets, were used to train, evaluate, and test the IDS model. To ensure that we obtained a holistic and efficient model, we performed data balancing with synthetic minority oversampling technique (SMOTE) and adaptive synthetic (ADASYN) techniques; after that, we used stratified K-fold to split the data into training, validation, and testing sets. Based on the best two models, we construct our proposed BoostedEnML model using LightGBM and XGBoost, as the combination of the two classifiers gives a lightweight yet efficient model, which is part of the target of this research. Experimental results show that BoostedEnML outperformed existing ensemble models in terms of accuracy, precision, recall, F-score, and area under the curve (AUC), reaching 100% in each case on the selected datasets for multiclass classification. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
38. Mitigating false negatives in imbalanced datasets: An ensemble approach.
- Author
-
Vasconcelos, Marcelo and Cavique, Luís
- Subjects
*FRAUD investigation, *DIAGNOSIS, *ALGORITHMS, *CLASSIFICATION, *FORECASTING
- Abstract
• Addressing imbalanced data in ML poses challenges due to class disproportion.
• In some imbalanced datasets, false negatives impact more than false positives.
• This work introduces the MinFNR algorithm to minimize False Negative Rates (FNR).
• The new algorithm strategically combines data, algorithmic, and hybrid approaches.
Imbalanced datasets present a challenge in machine learning, especially in binary classification scenarios where one class significantly outweighs the other. This imbalance often leads to models favoring the majority class, resulting in inadequate predictions for the minority class, specifically in false negatives. In response to this issue, this work introduces the MinFNR ensemble algorithm, designed to minimize False Negative Rates (FNR) in imbalanced datasets. The new approach strategically combines data-level, algorithmic-level, and hybrid-level approaches to enhance overall predictive capabilities while minimizing computational resources using the Set Covering Problem (SCP) formulation. Through a comprehensive evaluation of diverse datasets, MinFNR consistently outperforms individual algorithms, showing its potential for applications where the cost of false negatives is substantial, such as fraud detection and medical diagnosis. This work also contributes to ongoing efforts to improve the reliability and effectiveness of machine learning algorithms in real imbalanced scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
39. Discrimination among Winding Mechanical Defects in Transformer Using Noise Detection and Data Mining Boosting Method
- Author
-
Zahra Moravej, Seyed Mahmood Mortazavi, and Mojtaba Mohseni
- Subjects
frequency response, ensemble algorithms, decision tree, local outlier factor, Electronics, TK7800-8360, Industry, HD2321-4730.9
- Abstract
In this paper, an efficient hybrid method is proposed to detect and discriminate mechanical defects of transformer windings, based on extracting the winding frequency responses and applying outlier detection and ensemble algorithms. First, the frequency response of the high-voltage winding of a real transformer model (1.6 MVA) was extracted under different conditions and arranged as primary data. Then, due to the high standard deviation of the characteristics and the weight of the outlier samples above the threshold of 1.1, the Local Outlier Factor (LOF) method was used to clean the samples. Finally, data mining algorithms were used to detect and distinguish mechanical defects. Based on the results, the decision tree bagging ensemble method reported the best accuracy compared to other techniques and, with LOF, improved on the accuracy of the decision tree, reaching a total accuracy of 92.68%. These results also showed that LOF improved the accuracy of all methods. Therefore, it can be claimed that the proposed method is able to discriminate the mechanical defects of the transformer winding with appropriate accuracy.
- Published
- 2021
- Full Text
- View/download PDF
40. Response Spectrum Analysis of Multi-Story Shear Buildings Using Machine Learning Techniques
- Author
-
Manolis Georgioudakis and Vagelis Plevris
- Subjects
response spectrum analysis, ensemble algorithms, machine learning, shear building, SHAP explainability, Electronic computers. Computer science, QA75.5-76.95
- Abstract
The dynamic analysis of structures is a computationally intensive procedure that must be considered, in order to make accurate seismic performance assessments in civil and structural engineering applications. To avoid these computationally demanding tasks, simplified methods are often used by engineers in practice, to estimate the behavior of complex structures under dynamic loading. This paper presents an assessment of several machine learning (ML) algorithms, with different characteristics, that aim to predict the dynamic analysis response of multi-story buildings. Large datasets of dynamic response analyses results were generated through standard sampling methods and conventional response spectrum modal analysis procedures. In an effort to obtain the best algorithm performance, an extensive hyper-parameter search was elaborated, followed by the corresponding feature importance. The ML model which exhibited the best performance was deployed in a web application, with the aim of providing predictions of the dynamic responses of multi-story buildings, according to their characteristics.
- Published
- 2023
- Full Text
- View/download PDF
41. Ensemble multi-objective optimization approach for heterogeneous drone delivery problem.
- Author
-
Wen, Xupeng, Wu, Guohua, Li, Shuanglin, and Wang, Ling
- Subjects
*DRONE aircraft delivery, *CUSTOMER satisfaction, *GENETIC algorithms, *SATISFACTION, *EVOLUTIONARY algorithms, *FACTOR analysis
- Abstract
Recently, driven by advancements in the payload capacity and endurance of drones, the logistics industry has shown significant interest in drone Last-Mile logistics. Efficient routing is a crucial scientific challenge in drone delivery problems. In this study, we address the routing problem in heterogeneous drone delivery, which involves a large drone transporting multiple small drones to sub-regions for parcel delivery, aiming to both reduce the drones' distance costs and improve customer satisfaction, termed HDDPBO. To tackle the HDDPBO problem effectively, we propose a voting-based ensemble multi-objective genetic approach, named VEMOGA, in which an improved clustering algorithm is first developed to divide customers into K clusters, enabling each drone to handle multiple parcel deliveries within a sub-region. In this way, it reduces the complexity of HDDPBO by transforming it into multiple sub-problems. Secondly, a multi-objective genetic approach with heuristic operators is proposed to explore high-quality solutions, in which customized crossover and mutation operators are designed, and a voting-based ensemble algorithm is designed to robustly select the Pareto frontier with high-quality convergence and diversity. Extensive experiments are conducted on synthetic instances to evaluate the proposed algorithm, and the experimental results demonstrate superior performance compared to three other baselines. Additionally, a real-world instance has been scrutinized to ascertain the applicability to Last-Mile logistics, sensitivity analyses of pivotal factors have been conducted, and several managerial insights pertinent to drone-based Last-Mile logistics are given.
• A multi-objective model is proposed for heterogeneous drone logistics.
• A voting-based ensemble multi-objective genetic algorithm is proposed.
• The proposed algorithm achieves higher satisfaction and lower costs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. Crop Prediction Based on Environmental Factors Using Machine Learning Ensemble Algorithms
- Author
-
Ashok, Tatapudi, Suresh Varma, P., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Peng, Sheng-Lung, editor, Son, Le Hoang, editor, Suseendran, G., editor, and Balaganesh, D., editor
- Published
- 2020
- Full Text
- View/download PDF
43. Predicting clinical trial outcomes using drug bioactivities through graph database integration and machine learning.
- Author
-
Murali, Vidhya, Muralidhar, Y. Pradyumna, Königs, Cassandra, Nair, Meera, Madhu, Sethulekshmi, Nedungadi, Prema, Srinivasa, Gowri, and Athri, Prashanth
- Subjects
*CLINICAL trials, *TREATMENT effectiveness, *DRUG utilization, *RANDOM forest algorithms, *DRUG approval, *MACHINE learning, *NATURAL language processing
- Abstract
The ability to estimate the probability of a drug receiving approval in clinical trials provides natural advantages for optimizing pharmaceutical research workflows. Success rates of clinical trials have deep implications for costs and duration of development, and are under pressure due to stringent regulatory approval processes. We propose a machine learning approach that can predict the outcome of the trial with reliable accuracies, using biological activities, physicochemical properties of the compounds, target‐related features, and NLP‐based compound representation. In the above list, biological activities have never been used as an independent variable towards the prediction of clinical trial outcomes. We have extracted the drug–disease pair from clinical trials and mapped target(s) to that pair using multiple data sources. Empirical results demonstrate that ensemble learning outperforms independently trained, small‐data ML models. We report results and inferences derived from a Random forest classifier with an average accuracy of 93%, and an F1 score of 0.96 for the "Pass" class. "Pass" refers to one of the two classes (Pass/Fail) of all clinical trials, and the model performed well in predicting the "Pass" category. Through the analysis of feature contributions to predictive capability, we have demonstrated that bioactivity plays a statistically significant role in predicting clinical trial outcome. A significant effort has gone into the production of the dataset that, for the first time, integrates clinical trial information with protein targets. Cleaned, organized, integrated data and code to map these entities, created as a part of this work, are available open‐source. This reproducibility and the freely available code ensure that researchers with access to deep curated and proprietary clinical trial databases (we only use open‐source data in this study) can further expand the scope of the results. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
44. Prediction of Sinter Chemical Composition Based on Ensemble Learning Algorithms
- Author
-
Niu, Lele, Liu, Zhengjian, Zhang, Jianliang, Sun, Qingke, Schenk, Johannes, Wang, Jiabao, and Wang, Yaozu
- Published
- 2023
- Full Text
- View/download PDF
45. Comparing the Accuracy of Prediction Models based on Ensemble Machine Learning Schemes.
- Author
-
Hernández, Carlos and Alvar, Álvaro
- Subjects
MACHINE learning, PREDICTION models, HONEY, SUPPORT vector machines
- Abstract
This research analyzes the influence of the configuration of ensemble learning algorithms on their accuracy when predicting the annual production of honey for export in the south of Chile. The research is carried out following a classic 4-stage methodology (analysis, design, development, and validation). During the analysis, data is gathered and preprocessed. During the design, independent variables, ensemble algorithms, and performance metrics (correlation coefficient, MAE and RMSE) are defined. Construction and validation are carried out using the software WEKA. To build the models, 9 variables are considered. The dataset is split into a subset for training and test (80%) and another one for validation (20%). The predictions are obtained by configuring a stacking scheme as the ensemble and interchanging a support vector machine, a linear regression, a decision tree, and a Gaussian process as meta or base learners. According to the results, while the correlation coefficient between predictions and actual values fluctuates significantly in the range of 18% to 46%, MAE ranges between 32% and 37%. In conclusion, although the predictions are inaccurate, the results suggest that the arrangement of the meta and base algorithms within the ensemble does affect the prediction accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2021
46. Applying Ensemble Machine Learning Algorithms to Predict Professional Career Development Preferences among University Students.
- Author
-
Paiva, Galo and Hernández, Carlos
- Subjects
MACHINE learning, COLLEGE students, CAREER development, PROFESSIONAL education, ALGORITHMS
- Abstract
This research is focused on the development and comparison of models based on ensemble machine learning algorithms to predict professional career development preferences among students five years after their graduation using the results of the University Entrepreneurial Spirit Students' Survey 2018. The research is carried out following a classic 4-stage methodology (analysis, design, development, and validation). During the analysis, surveys are thoroughly reviewed and preprocessed. During the design, questions are grouped and combined to build 11 predictive models. Construction and validation are carried out entirely using the software WEKA. For the purposes of this investigation, 1,121 surveys are considered. Initially, the dataset is split into a subset for training and test (80%) and a subset for validation (20%). The approach to predicting students' mid-term career preferences uses an ensemble scheme (stacking) composed of a logistic regression as meta-model and a decision tree and a support vector machine as base models. Experimental results show that half of the proposed models correctly predict around 77% of the surveys. In conclusion, ensemble models can be used to predict students' professional career development preferences. However, the accuracy of the predictions depends on the attribute selection. [ABSTRACT FROM AUTHOR]
- Published
- 2021
47. A Systematic Evaluation of Supervised Machine Learning Algorithms for Cell Phenotype Classification Using Single-Cell RNA Sequencing Data
- Author
-
Xiaowen Cao, Li Xing, Elham Majd, Hua He, Junhua Gu, and Xuekui Zhang
- Subjects
classification, gene selection, ensemble algorithms, machine learning, single-cell RNA sequencing, supervised algorithms, Genetics, QH426-470
- Abstract
The new technology of single-cell RNA sequencing (scRNA-seq) can yield valuable insights into gene expression and give critical information about the cellular compositions of complex tissues. In recent years, vast numbers of scRNA-seq datasets have been generated and made publicly available, and this has enabled researchers to train supervised machine learning models for predicting or classifying various cell-level phenotypes. This has led to the development of many new methods for analyzing scRNA-seq data. Despite the popularity of such applications, there has as yet been no systematic investigation of the performance of these supervised algorithms using predictors from various sizes of scRNA-seq datasets. In this study, 13 popular supervised machine learning algorithms for cell phenotype classification were evaluated using published real and simulated datasets with diverse cell sizes. This benchmark comprises two parts. In the first, real datasets were used to assess the computing speed and cell phenotype classification performance of popular supervised algorithms. The classification performances were evaluated using the area under the receiver operating characteristic curve, F1-score, Precision, Recall, and false-positive rate. In the second part, we evaluated gene-selection performance using published simulated datasets with a known list of real genes. The results showed that ElasticNet with interactions performed the best for small and medium-sized datasets. The NaiveBayes classifier was found to be another appropriate method for medium-sized datasets. With large datasets, the performance of the XGBoost algorithm was found to be excellent. Ensemble algorithms were not found to be significantly superior to individual machine learning methods. Including interactions in the ElasticNet algorithm caused a significant performance improvement for small datasets. The linear discriminant analysis algorithm was found to be the best choice when speed is critical; it is the fastest method, it can scale to handle large sample sizes, and its performance is not much worse than the top performers.
- Published
- 2022
48. A Systematic Evaluation of Supervised Machine Learning Algorithms for Cell Phenotype Classification Using Single-Cell RNA Sequencing Data.
- Author
-
Cao, Xiaowen, Xing, Li, Majd, Elham, He, Hua, Gu, Junhua, and Zhang, Xuekui
- Subjects
SUPERVISED learning, RNA sequencing, MACHINE learning, FISHER discriminant analysis, PHENOTYPES, RECEIVER operating characteristic curves
- Abstract
The new technology of single-cell RNA sequencing (scRNA-seq) can yield valuable insights into gene expression and give critical information about the cellular compositions of complex tissues. In recent years, vast numbers of scRNA-seq datasets have been generated and made publicly available, and this has enabled researchers to train supervised machine learning models for predicting or classifying various cell-level phenotypes. This has led to the development of many new methods for analyzing scRNA-seq data. Despite the popularity of such applications, there has as yet been no systematic investigation of the performance of these supervised algorithms using predictors from various sizes of scRNA-seq datasets. In this study, 13 popular supervised machine learning algorithms for cell phenotype classification were evaluated using published real and simulated datasets with diverse cell sizes. This benchmark comprises two parts. In the first, real datasets were used to assess the computing speed and cell phenotype classification performance of popular supervised algorithms. The classification performances were evaluated using the area under the receiver operating characteristic curve, F1-score, Precision, Recall, and false-positive rate. In the second part, we evaluated gene-selection performance using published simulated datasets with a known list of real genes. The results showed that ElasticNet with interactions performed the best for small and medium-sized datasets. The NaiveBayes classifier was found to be another appropriate method for medium-sized datasets. With large datasets, the performance of the XGBoost algorithm was found to be excellent. Ensemble algorithms were not found to be significantly superior to individual machine learning methods. Including interactions in the ElasticNet algorithm caused a significant performance improvement for small datasets. 
The linear discriminant analysis algorithm was found to be the best choice when speed is critical; it is the fastest method, it can scale to handle large sample sizes, and its performance is not much worse than the top performers. [ABSTRACT FROM AUTHOR]
- Published
- 2022
49. Improving Sentiment Analysis for Social Media Applications Using an Ensemble Deep Learning Language Model.
- Author
-
Alsayat, Ahmed
- Subjects
*DEEP learning, *SENTIMENT analysis, *SOCIAL media, *COVID-19 pandemic, *USER-generated content, *CLASSIFICATION algorithms, *LANGUAGE models
- Abstract
As data on social media grow rapidly through users' contributions, especially with the recent coronavirus pandemic, there is a growing need to understand user behavior. The dataset tested in this study covers the opinions behind posts on the pandemic. Finding the most suitable classification algorithms for this kind of data is challenging. Within this context, deep learning models for sentiment analysis can offer detailed representation capabilities and enhanced performance compared to existing feature-based techniques. In this paper, we focus on enhancing the performance of sentiment classification using a customized deep learning model that combines an advanced word embedding technique with a long short-term memory (LSTM) network. Furthermore, we propose an ensemble model that combines our baseline classifier with other state-of-the-art classifiers used for sentiment analysis. The contributions of this paper are twofold. (1) We establish a robust framework based on word embedding and an LSTM network that learns the contextual relations among words and understands unseen or rare words in relatively emerging situations such as the coronavirus pandemic by recognizing suffixes and prefixes from training data. (2) We capture and utilize the significant differences in state-of-the-art methods by proposing a hybrid ensemble model for sentiment analysis. We conduct several experiments using our own Twitter coronavirus hashtag dataset as well as public review datasets from Amazon and Yelp. Finally, a statistical study indicates that the proposed models surpass other models in terms of classification accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2022
50. An efficient chaotic salp swarm optimization approach based on ensemble algorithm for class imbalance problems.
- Author
-
Gillala, Rekha, Vuyyuru, Krishna Reddy, Jatoth, Chandrashekar, and Fiore, Ugo
- Subjects
*ALGORITHMS, *FEATURE selection, *MATHEMATICAL optimization, *PARTICLE swarm optimization, *METAHEURISTIC algorithms, *SCIENTIFIC community
- Abstract
Class imbalance problems have attracted the attention of the research community, but few works have focused on feature selection with imbalanced datasets. To handle class imbalance problems, we developed a novel fitness function for feature selection using the chaotic salp swarm optimization algorithm, an efficient meta-heuristic optimization algorithm that has been successfully used in a wide range of optimization problems. This paper proposes an AdaBoost algorithm with chaotic salp swarm optimization. The most discriminating features are selected using salp swarm optimization, and AdaBoost classifiers are then trained on the selected features. Experiments show the ability of the proposed technique to find the optimal features that maximize the performance of AdaBoost. [ABSTRACT FROM AUTHOR]
- Published
- 2021