Descriptor: "stacked generalization" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"stacked generalization"' showing total 260 results

51. Stacking-Based Integrated Machine Learning with Data Reduction

Author: Czarnowski, Ireneusz, Jędrzejowicz, Piotr, Howlett, Robert James, Series editor, Jain, Lakhmi C., Series editor, Czarnowski, Ireneusz, editor, and Howlett, Robert J., editor
Published: 2018
Full Text: View/download PDF

52. Diagnosis of Inflammatory Bowel Disease and Colorectal Cancer through Multi-View Stacked Generalization Applied on Gut Microbiome Data

Author: Sultan Imangaliyev, Jörg Schlötterer, Folker Meyer, and Christin Seifert
Subjects: gut microbiome, machine learning, classification, inflammatory bowel disease, colorectal cancer, stacked generalization, Medicine (General), R5-920
Abstract: Most of the microbiome studies suggest that using ensemble models such as Random Forest results in best predictive power. In this study, we empirically evaluate a more powerful ensemble learning algorithm, multi-view stacked generalization, on pediatric inflammatory bowel disease and adult colorectal cancer patients’ cohorts. We aim to check whether stacking would lead to better results compared to using a single best machine learning algorithm. Stacking achieves the best test set Average Precision (AP) on inflammatory bowel disease dataset reaching AP = 0.69, outperforming both the best base classifier (AP = 0.61) and the baseline meta learner built on top of base classifiers (AP = 0.63). On colorectal cancer dataset, the stacked classifier also outperforms (AP = 0.81) both the best base classifier (AP = 0.79) and the baseline meta learner (AP = 0.75). Stacking achieves best predictive performance on test set outperforming the best classifiers on both patient cohorts. Application of the stacking solves the issue of choosing the most appropriate machine learning algorithm by automating the model selection procedure. Clinical application of such a model is not limited to diagnosis task only, but it also can be extended to biomarker selection thanks to feature selection procedure.
Published: 2022
Full Text: View/download PDF

53. Convolutional neural network-based ensemble methods to recognize Bangla handwritten character

Author: Mir Moynuddin Ahmed Shibly, Tahmina Akter Tisha, Tanzina Akter Tani, and Shamim Ripon
Subjects: Convolutional neural network, Ensemble learning, Bangla handwritten character recognition, Deep learning, Stacked generalization, Bootstrap aggregating, Electronic computers. Computer science, QA75.5-76.95
Abstract: In this era of advancements in deep learning, an autonomous system that recognizes handwritten characters and texts can be eventually integrated with the software to provide better user experience. Like other languages, Bangla handwritten text extraction also has various applications such as post-office automation, signboard recognition, and many more. A large-scale and efficient isolated Bangla handwritten character classifier can be the first building block to create such a system. This study aims to classify the handwritten Bangla characters. The proposed methods of this study are divided into three phases. In the first phase, seven convolutional neural networks i.e., CNN-based architectures are created. After that, the best performing CNN model is identified, and it is used as a feature extractor. Classifiers are then obtained by using shallow machine learning algorithms. In the last phase, five ensemble methods have been used to achieve better performance in the classification task. To systematically assess the outcomes of this study, a comparative analysis of the performances has also been carried out. Among all the methods, the stacked generalization ensemble method has achieved better performance than the other implemented methods. It has obtained accuracy, precision, and recall of 98.68%, 98.69%, and 98.68%, respectively on the Ekush dataset. Moreover, the use of CNN architectures and ensemble methods in large-scale Bangla handwritten character recognition has also been justified by obtaining consistent results on the BanglaLekha-Isolated dataset. Such efficient systems can move the handwritten recognition to the next level so that the handwriting can easily be automated.
Published: 2021
Full Text: View/download PDF

54. Predicting movie audience with stacked generalization by combining machine learning algorithms.

Author: Junghoon Park and Changwon Lim
Subjects: MACHINE learning, DATA mining, COMPUTER algorithms
Abstract: The Korea film industry has matured and the number of movie-watching per capita has reached the highest level in the world. Since then, movie industry growth rate is decreasing and even the total sales of movies per year slightly decreased in 2018. The number of moviegoers is the first factor of sales in movie industry and also an important factor influencing additional sales. Thus it is important to predict the number of movie audiences. In this study, we predict the cumulative number of audiences of films using stacking, an ensemble method. Stacking is a kind of ensemble method that combines all the algorithms used in the prediction. We use box office data from Korea Film Council and web comment data from Daum Movie (www.movie.daum.net). This paper describes the process of collecting and preprocessing of explanatory variables and explains regression models used in stacking. Final stacking model outperforms in the prediction of test set in terms of RMSE. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

55. CLG Authorship Analytics: a library for authorship verification

Author: Moreau, Erwan and Vogel, Carl
Published: 2022
Full Text: View/download PDF

56. Hybrid stacked ensemble combined with genetic algorithms for diabetes prediction

Author: Abdollahi, Jafar and Nouri-Moghaddam, Babak
Published: 2022
Full Text: View/download PDF

57. Protein-protein interaction prediction using a hybrid feature representation and a stacked generalization scheme

Author: Kuan-Hsi Chen, Tsai-Feng Wang, and Yuh-Jyh Hu
Subjects: Protein-protein interaction, Stacked generalization, Gene ontology, Network topology, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
Abstract: Background: Although various machine learning-based predictors have been developed for estimating protein–protein interactions, their performances vary with dataset and species, and are affected by two primary aspects: choice of learning algorithm, and the representation of protein pairs. To improve the performance of predicting protein–protein interactions, we exploit the synergy of multiple learning algorithms, and utilize the expressiveness of different protein-pair features. Results: We developed a stacked generalization scheme that integrates five learning algorithms. We also designed three types of protein-pair features based on the physicochemical properties of amino acids, gene ontology annotations, and interaction network topologies. When tested on 19 published datasets collected from eight species, the proposed approach achieved a significantly higher or comparable overall performance, compared with seven competitive predictors. Conclusion: We introduced an ensemble learning approach for PPI prediction that integrated multiple learning algorithms and different protein-pair representations. The extensive comparisons with other state-of-the-art prediction tools demonstrated the feasibility and superiority of the proposed method.
Published: 2019
Full Text: View/download PDF

58. LieToMe: An Ensemble Approach for Deception Detection from Facial Cues.

Author: Avola, Danilo, Cascio, Marco, Cinque, Luigi, Fagioli, Alessio, and Foresti, Gian Luca
Subjects: *DECEPTION, *BEHAVIOR, *LIE detectors & detection, *MAGNETIC resonance, *PEDESTRIANS, *FACE
Abstract: Deception detection is a relevant ability in high stakes situations such as police interrogatories or court trials, where the outcome is highly influenced by the interviewed person behavior. With the use of specific devices, e.g. polygraph or magnetic resonance, the subject is aware of being monitored and can change his behavior, thus compromising the interrogation result. For this reason, video analysis-based methods for automatic deception detection are receiving ever increasing interest. In this paper, a deception detection approach based on RGB videos, leveraging both facial features and stacked generalization ensemble, is proposed. First, a face, which is well-known to present several meaningful cues for deception detection, is identified, aligned, and masked to build video signatures. These signatures are constructed starting from five different descriptors, which allow the system to capture both static and dynamic facial characteristics. Then, video signatures are given as input to four base-level algorithms, which are subsequently fused applying the stacked generalization technique, resulting in a more robust meta-level classifier used to predict deception. By exploiting relevant cues via specific features, the proposed system achieves improved performances on a public dataset of famous court trials, with respect to other state-of-the-art methods based on facial features, highlighting the effectiveness of the proposed method. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

59. StackGenVis: Alignment of Data, Algorithms, and Models for Stacking Ensemble Learning Using Performance Metrics.

Author: Chatzimparmpas, Angelos, Martins, Rafael M., Kucher, Kostiantyn, and Kerren, Andreas
Subjects: KEY performance indicators (Management), ALGORITHMS, VISUAL analytics, ACQUISITION of data, MACHINE learning
Abstract: In machine learning (ML), ensemble methods, such as bagging, boosting, and stacking, are widely-established approaches that regularly achieve top-notch predictive performance. Stacking (also called “stacked generalization”) is an ensemble method that combines heterogeneous base models, arranged in at least one layer, and then employs another metamodel to summarize the predictions of those models. Although it may be a highly-effective approach for increasing the predictive performance of ML, generating a stack of models from scratch can be a cumbersome trial-and-error process. This challenge stems from the enormous space of available solutions, with different sets of data instances and features that could be used for training, several algorithms to choose from, and instantiations of these algorithms using diverse parameters (i.e., models) that perform differently according to various metrics. In this work, we present a knowledge generation model, which supports ensemble learning with the use of visualization, and a visual analytics system for stacked generalization. Our system, StackGenVis, assists users in dynamically adapting performance metrics, managing data instances, selecting the most important features for a given data set, choosing a set of top-performant and diverse algorithms, and measuring the predictive performance. In consequence, our proposed tool helps users to decide between distinct models and to reduce the complexity of the resulting stack by removing overpromising and underperforming models. The applicability and effectiveness of StackGenVis are demonstrated with two use cases: a real-world healthcare data set and a collection of data related to sentiment/stance detection in texts. Finally, the tool has been evaluated through interviews with three ML experts. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF
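
Note: The abstract of entry 59 describes stacking as heterogeneous base models whose predictions are summarized by a metamodel. The following is a minimal sketch of that idea, not the StackGenVis implementation. Everything in it is an assumption made for illustration: the two toy base models (`MeanModel`, `LineModel`), the least-squares metamodel, the 5-fold split, and the synthetic data. It uses only the Python standard library.

```python
from statistics import mean


class MeanModel:
    """Hypothetical base model 1: always predicts the training mean."""
    def fit(self, xs, ys):
        self.mu = mean(ys)
        return self

    def predict(self, xs):
        return [self.mu for _ in xs]


class LineModel:
    """Hypothetical base model 2: ordinary least-squares line y = a + b*x."""
    def fit(self, xs, ys):
        mx, my = mean(xs), mean(ys)
        sxx = sum((x - mx) ** 2 for x in xs)
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        self.b = sxy / sxx
        self.a = my - self.b * mx
        return self

    def predict(self, xs):
        return [self.a + self.b * x for x in xs]


def out_of_fold_predictions(make_model, xs, ys, k):
    """Predict every point with a model that did not see it during training."""
    preds = [0.0] * len(xs)
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        held = [i for i in range(len(xs)) if i % k == fold]
        model = make_model().fit([xs[i] for i in train], [ys[i] for i in train])
        for i, p in zip(held, model.predict([xs[i] for i in held])):
            preds[i] = p
    return preds


def fit_meta_weights(p1, p2, ys):
    """Metamodel: least-squares weights (w1, w2) so that w1*p1 + w2*p2 ~ y."""
    a11 = sum(u * u for u in p1)
    a12 = sum(u * v for u, v in zip(p1, p2))
    a22 = sum(v * v for v in p2)
    b1 = sum(u * y for u, y in zip(p1, ys))
    b2 = sum(v * y for v, y in zip(p2, ys))
    det = a11 * a22 - a12 * a12  # 2x2 normal equations solved by Cramer's rule
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det)


def fit_stack(xs, ys, k=5):
    """Train the metamodel on out-of-fold predictions, then refit base models on all data."""
    oof_mean = out_of_fold_predictions(MeanModel, xs, ys, k)
    oof_line = out_of_fold_predictions(LineModel, xs, ys, k)
    weights = fit_meta_weights(oof_mean, oof_line, ys)
    base = [MeanModel().fit(xs, ys), LineModel().fit(xs, ys)]
    return base, weights


def predict_stack(base, weights, xs):
    """Combine the base-model predictions with the learned metamodel weights."""
    columns = [m.predict(xs) for m in base]
    return [sum(w * col[i] for w, col in zip(weights, columns)) for i in range(len(xs))]


# Synthetic, exactly linear data (an assumption chosen to keep the sketch easy to check).
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
base, weights = fit_stack(xs, ys)
stacked = predict_stack(base, weights, [10.0, 11.0])
print(weights, stacked)
```

Because the toy data are exactly linear, the metamodel should place essentially all of its weight on the line model. Training the metamodel on out-of-fold predictions rather than in-sample predictions is the usual way to keep it from rewarding base models that merely memorize the training set.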

60. Multi-Aspect Oriented Sentiment Classification: Prior Knowledge Topic Modelling and Ensemble Learning Classifier Approach

Author: Najwa AlGhamdi, Shaheen Khatoon, and Majed Alshamari
Subjects: sentiment classification, prior knowledge, topic models, data labelling, ensemble learning, stacked generalization, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Abstract: User-generated content on numerous sites is indicative of users’ sentiment towards many issues, from daily food intake to using new products. Amid the active usage of social networks and micro-blogs, notably during the COVID-19 pandemic, we may glean insights into any product or service through users’ feedback and opinions. Thus, it is often difficult and time consuming to go through all the reviews and analyse them in order to recognize the notion of the overall goodness or badness of the reviews before making any decision. To overcome this challenge, sentiment analysis has been used as an effective rapid way to automatically gauge consumers’ opinions. Large reviews will possibly encompass both positive and negative opinions on different features of a product/service in the same review. Therefore, this paper proposes an aspect-oriented sentiment classification using a combination of the prior knowledge topic model algorithm (SA-LDA), automatic labelling (SentiWordNet) and ensemble method (Stacking). The framework is evaluated using the dataset from different domains. The results have shown that the proposed SA-LDA outperformed the standard LDA. In addition, the suggested ensemble learning classifier has increased the accuracy of the classifier by more than ~3% when it is compared to baseline classification algorithms. The study concluded that the proposed approach is equally adaptable across multi-domain applications.
Published: 2022
Full Text: View/download PDF

61. Mapping wetland using the object-based stacked generalization method based on multi-temporal optical and SAR data

Author: Yaotong Cai, Xinyu Li, Meng Zhang, and Hui Lin
Subjects: Wetland, Classification, Sentinel-1/2, Multi-Temporal, Object-Based, Stacked generalization, Physical geography, GB3-5030, Environmental sciences, GE1-350
Abstract: Wetland ecosystems have experienced dramatic challenges in the past few decades due to natural and human factors. Wetland maps are essential for the conservation and management of terrestrial ecosystems. This study is to obtain an accurate wetland map using an object-based stacked generalization (Stacking) method on the basis of multi-temporal Sentinel-1 and Sentinel-2 data. Firstly, the Robust Adaptive Spatial Temporal Fusion Model (RASTFM) is used to get time series Sentinel-2 NDVI, from which the vegetation phenology variables are derived by the threshold method. Subsequently, both vertical transmit-vertical receive (VV) and vertical transmit-horizontal receive (VH) polarization backscatters (σ0 VV, σ0 VH) are obtained using the time series Sentinel-1 images. Speckle noise inherent in SAR data, resulting in over-segmentation or under-segmentation, can affect image segmentation and degrade the accuracies of wetland classification. Therefore, we segment Sentinel-2 multispectral images to delineate meaningful objects in this study. Then, in order to reduce data redundancy and computation time, we analyze the optimal feature combination using the Sentinel-2 multispectral images, Sentinel-2 NDVI time series, phenological variables and other vegetation index derived from Sentinel-2 multispectral images, as well as time series Sentinel-1 backscatters at the object level. Finally, the stacked generalization algorithm is utilized to extract the wetland information based on the optimal feature combination in the Dongting Lake wetland. The overall accuracy and Kappa coefficient of the object-based stacked generalization method are 92.46% and 0.92, which are 3.88% and 0.04 higher than that using the pixel-based method. Moreover, the object-based stacked generalization algorithm is superior to single classifiers in classifying vegetation of high heterogeneity areas.
Published: 2020
Full Text: View/download PDF

62. A Multi-Tier Stacked Ensemble Algorithm to Reduce the Regret of Incremental Learning for Streaming Data

Author: R. Pari, M. Sandhya, and Sharmila Sankar
Subjects: Incremental learning, stream data mining, ensemble learning, stacked generalization, stacked ensemble and classification accuracy, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Incremental Learning (IL) is an exciting paradigm that deals with classification problems based on a streaming or sequential data. IL aims to achieve the same level of prediction accuracy on streaming data as that of a batch learning model that has the opportunity to see the entire data at once. The performance of the traditional algorithms that can learn the streaming data is nowhere comparable to that of batch learning algorithms. Reducing the regret of IL is a challenging task in real-world applications. Hence developing an innovative algorithm to improve the ILs performance is a necessity. In this paper, we propose a multi-tier stacked ensemble (MTSE) algorithm that uses incremental learners as the base classifiers. This novel algorithm uses the incremental learners to predict the results that get combined by the combination schemes in the next tier. The meta-learning in the next tier generalizes the output from the combination schemes to give the final prediction. We tested the MTSE with three data sets from the UCI machine learning repository. The results reveal that MTSE is superior in performance over the SE learning.
Published: 2018
Full Text: View/download PDF

63. Prediction for Membrane Protein Types Based on Effective Fusion Representation and MIC-GA Feature Selection

Author: Lei Guo, Shunfang Wang, Zhenfeng Lei, and Xueren Wang
Subjects: Prediction for membrane protein types, fusion representation, MIC-GA feature selection, ensemble method, stacked generalization, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Membrane proteins occupy an important position in the life activities of humans and other species. The elucidation of membrane protein types provides clues for understanding the structure and function of proteins. With the fusion of various protein information including amino acid classification, physicochemical property, and evolutionary information, this paper proposes a system for predicting membrane protein types. In this system, a new feature selection method called MIC-GA is proposed to deal with the curse of high-dimensional features. The findings show that this approach is effective in reducing feature dimensions and improves prediction accuracy. Ensemble method based on stacked generalization is also used to solve the problem of feature heterogeneity. The performance of the present method is evaluated on two benchmark datasets. The overall prediction accuracies of eight types are 89.23% and 93.49% using jackknife test and independent test, respectively. The final experimental results show that our method is more effective than the existing methods for prediction of membrane protein types.
Published: 2018
Full Text: View/download PDF

64. An analytical toast to wine: Using stacked generalization to predict wine preference.

Author: Larkin, Taylor and McManus, Denise
Subjects: *WINES, *GENERALIZATION, *MACHINE learning
Abstract: Due to the intricacies surrounding taste profiles, one's view of good wine is subjective. Therefore, it is advantageous to provide a more objective, data‐driven way to assess wine preferences. Motivated by a previous study that modeled wine preferences using machine learning algorithms, this work presents an ensemble approach to predict a wine sample's quality level given its physiochemical properties. Results show the proposed framework out‐performs many sophisticated models including the one recommended by the motivational study. Moreover, the proposed framework offers a simple variable importance strategy to gain insight as to the relevance of the predictor variables and is applied to both simulated and real data. Given the predictive power of using ensembles, especially when they can be interpretable, practitioners can use the following approach to provide an accurate and inferential perspective towards demystifying wine preferences. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

65. Stacked penalized logistic regression for selecting views in multi-view learning.

Author: van Loon, Wouter, Fokkema, Marjolein, Szabo, Botond, and de Rooij, Mark
Subjects: *MEDICAL coding, *PROCESS optimization, *FORECASTING, *ACQUISITION of data, *MEDICAL research, *LOGISTIC regression analysis
Highlights: • New view selection method based on multi-view stacking is introduced. • Importance of nonnegativity constraints in multi-view stacking is demonstrated. • New method outperforms group lasso in view selection. • New method can make use of faster algorithms and is easily parallelized.
Abstract: In biomedical research, many different types of patient data can be collected, such as various types of omics data and medical imaging modalities. Applying multi-view learning to these different sources of information can increase the accuracy of medical classification models compared with single-view procedures. However, collecting biomedical data can be expensive and/or burdening for patients, so that it is important to reduce the amount of required data collection. It is therefore necessary to develop multi-view learning methods which can accurately identify those views that are most important for prediction. In recent years, several biomedical studies have used an approach known as multi-view stacking (MVS), where a model is trained on each view separately and the resulting predictions are combined through stacking. In these studies, MVS has been shown to increase classification accuracy. However, the MVS framework can also be used for selecting a subset of important views. To study the view selection potential of MVS, we develop a special case called stacked penalized logistic regression (StaPLR). Compared with existing view-selection methods, StaPLR can make use of faster optimization algorithms and is easily parallelized. We show that nonnegativity constraints on the parameters of the function which combines the views play an important role in preventing unimportant views from entering the model. We investigate the performance of StaPLR through simulations, and consider two real data examples. We compare the performance of StaPLR with an existing view selection method called the group lasso and observe that, in terms of view selection, StaPLR is often more conservative and has a consistently lower false positive rate. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

66. A Novel Deep Fuzzy Classifier by Stacking Adversarial Interpretable TSK Fuzzy Sub-Classifiers With Smooth Gradient Information.

Author: Gu, Suhang, Chung, Fu-Lai, and Wang, Shitong
Abstract: Different from our previous stacked-structure-based deep fuzzy classifier, in this paper, we explore the distinctive role of adversarial outputs of training samples in enhancing the classification performance of a stacked-structure-based deep fuzzy classifier. In order to achieve such goals, an adversarial Takagi–Sugeno–Kang (TSK) fuzzy classifier, which is denoted as TSKa, is proposed. With the TSKa, interpretable IF parts of first-order fuzzy rules can be generated by the random selection of fixed linguistic terms along each feature. According to our theoretical analysis, adversarial outputs of training samples enhance TSKa's generalization capability, thereby, resulting in the potential feasibility of leveraging their smooth gradient information with respect to the inputs in the training input space to construct a stacked-structure-based deep fuzzy classifier. In this paper, a novel deep fuzzy classifier is devised by stacking a series of TSKa sub-classifiers and training them by a deep learning strategy. An advantage of the proposed deep fuzzy classifier is its easy yet fast training. The training of each layer consists of two basic steps: computation of the smooth gradient information of adversarial outputs with respect to the inputs, and fast training of each corresponding TSKa by the least learning machine method. Comprehensive experiments on both benchmark datasets and an industrial case demonstrate the promising performance and advantages of the proposed deep fuzzy classifier. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

67. Housing Price Prediction via Improved Machine Learning Techniques.

Author: Truong, Quang, Nguyen, Minh, Dang, Hy, and Mei, Bo
Subjects: HOME prices, FORECASTING, MACHINE learning, PRICE indexes, REGRESSION analysis
Abstract: House Price Index (HPI) is commonly used to estimate the changes in housing price. Since housing price is strongly correlated to other factors such as location, area, population, it requires other information apart from HPI to predict individual housing price. There has been a considerably large number of papers adopting traditional machine learning approaches to predict housing prices accurately, but they rarely concern about the performance of individual models and neglect the less popular yet complex models. As a result, to explore various impacts of features on prediction methods, this paper will apply both traditional and advanced machine learning approaches to investigate the difference among several advanced models. This paper will also comprehensively validate multiple techniques in model implementation on regression and provide an optimistic result for housing price prediction. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

68. A Novel Data Mining on Breast Cancer Survivability Using MLP Ensemble Learners.

Author: Salehi, Mohsen, Razmara, Jafar, and Lotfi, Shahriar
Subjects: *DATA mining, *BREAST cancer, *MULTILAYER perceptrons, *MACHINE learning, *FORECASTING, *AUTOMATIC extracting (Information science), *CANCER patients
Abstract: Breast cancer survivability has always been an important and challenging issue for researchers. Different methods have been utilized mostly based on machine learning techniques for prediction of survivability among cancer patients. The most comprehensive available database of cancer incidence is SEER in the United States, which has been frequently used for different research purposes. In this paper, a new data mining has been performed on the SEER database in order to investigate the ability of machine learning techniques for survivability prediction of breast cancer patients. To this end, the data related to breast cancer incidence have been preprocessed to remove unusable records from the dataset. In sequel, two machine learning techniques were developed based on the Multi-Layer Perceptron (MLP) learner machine including MLP stacked generalization and mixture of MLP-experts to make predictions over the database. The machines have been evaluated using K-fold cross-validation technique. The evaluation of the predictors revealed an accuracy of 84.32% and 83.86% by the mixture of MLP-experts and MLP stacked generalization methods, respectively. This indicates that the predictors can be significantly used for survivability prediction suggesting time- and cost-effective treatment for breast cancer patients. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

69. PhishStack: Evaluation of Stacked Generalization in Phishing URLs Detection.

Author: Motiur Rahman, Sheikh Shah Mohammad, Islam, Takia, and Jabiullah, Md. Ismail
Subjects: PHISHING, GENERALIZATION, ERROR rates, MACHINE learning, INFORMATION technology security
Abstract: Stacked Generalization has been assessed and evaluated in the field of Phishing URLs detection. This field has become egregious area of information security. Recently, different phishing URLs detection systems have already proposed by several researchers. But due to the lack of proper machine learning algorithm selection, the performance of those systems can be affected. A details investigation on individual machine learning classifiers on level 1 and final prediction from level 2 along with three real datasets have been presented on this paper. The performance has been evaluated by precision-recall curve, AUC-ROC curve, accuracy, misclassification rate and mean absolute error (MAE). The best AUC area obtained from Random Forest and Multi Layer Perceptron (MLP) individually. But stacked generalization provides higher accuracy of 97.44% with numeric feature set in binary classification and in multiclass feature set (dataset three), provides the performance with 97.86% of accuracy. Stacked generalization provides minimum error rate and MAE of 2.142857% with multiclass feature set which leads to a strong basement of developing an anti-phishing tools. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

70. Towards Achieving Optimal Performance using Stacked Generalization Algorithm: A Case Study of Clinical Diagnosis of Malaria Fever.

Author: Oguntimilehin, Abiodun, Adetunmbi, Olusola, and Osho, Innocent
Published: 2019

71. A Stacked Generalization Framework for City Traffic Related Geospatial Data Analysis

Author: Liu, Xiliang, Yu, Li, Peng, Peng, Lu, Feng, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Morishima, Atsuyuki, editor, Zhang, Rong, editor, Zhang, Wenjie, editor, Chang, Lijun, editor, Fu, Tom Z. J, editor, Liu, Kuien, editor, Yang, Xiaoyan, editor, Zhu, Jia, editor, and Zhang, Zhiwei, editor
Published: 2016
Full Text: View/download PDF

72. 3Nsemble: Improved Electron Microscopy Image Segmentation Performance with Stacked Generalization

Author: Bhaskaran, Shubha
Subjects: Electrical engineering, Computer science, Neurosciences, Deep Learning, Electron Microscopy, Ensemble Learning, Image Segmentation, Stacked Generalization
Abstract: Deep neural networks are widely successful for many tasks of image analysis, including image segmentation. Ensemble models are generally used on deep neural networks not only to enhance the performance but also to improve robustness of predictions. In particular, robustness is currently a limiting factor for image segmentation networks. Here we propose 3Nsemble which uses stacked generalization to improve image segmentation of Electron Microscopy (EM) image data. This research, using neurobiology data, has shown highly accurate automated segmentations of organelles that greatly benefits the study of connectomics and moves us closer to understanding the brain and brain disorders. We compare performance of a trained meta-classifier against simple averaging. The additional costs of training and applying the meta-classifier is outweighed by the benefit of improved performance. The results show improvement in performance metrics with the trained predictions and most notably saw at least a 12% increase in Intersection Over Union (IOU) score.
Published: 2020
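
Note: The abstract of entry 72 compares a trained meta-classifier against simple averaging and reports results as Intersection Over Union (IOU). The sketch below illustrates only the simple-averaging baseline and the IOU metric on tiny binary masks. It is not the 3Nsemble code: the function names, the 0.5 threshold, the convention that two empty masks score 1.0, and the toy probability maps are all assumptions.

```python
def iou(pred, truth):
    """Intersection over Union of two equal-sized binary masks (lists of rows of 0/1)."""
    inter = union = 0
    for prow, trow in zip(pred, truth):
        for p, t in zip(prow, trow):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def average_ensemble(prob_maps, threshold=0.5):
    """Simple-averaging baseline: mean the per-pixel probabilities, then threshold."""
    n = len(prob_maps)
    rows, cols = len(prob_maps[0]), len(prob_maps[0][0])
    return [[1 if sum(m[r][c] for m in prob_maps) / n >= threshold else 0
             for c in range(cols)]
            for r in range(rows)]


# Toy ground truth and two hypothetical per-pixel probability maps.
truth = [[1, 1, 0], [0, 1, 0]]
model_a = [[0.9, 0.4, 0.2], [0.1, 0.8, 0.6]]
model_b = [[0.7, 0.8, 0.1], [0.3, 0.6, 0.2]]

mask = average_ensemble([model_a, model_b])
score = iou(mask, truth)
print(mask, score)
```

A trained meta-classifier would replace `average_ensemble` with a model fitted on the base networks' outputs, which is the comparison the abstract describes.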

73. MetaStackVis: Visually-Assisted Performance Evaluation of Metamodels in Stacking Ensemble Learning

Author: Ploshchik, Ilya
Abstract: Stacking, also known as stacked generalization, is a method of ensemble learning where multiple base models are trained on the same dataset, and their predictions are used as input for one or more metamodels in an extra layer. This technique can lead to improved performance compared to single layer ensembles, but often requires a time-consuming trial-and-error process. Therefore, the previously developed Visual Analytics system, StackGenVis, was designed to help users select the set of the most effective and diverse models and measure their predictive performance. However, StackGenVis was developed with only one metamodel: Logistic Regression. The focus of this Bachelor's thesis is to examine how alternative metamodels affect the performance of stacked ensembles through the use of a visualization tool called MetaStackVis. Our interactive tool facilitates visual examination of individual metamodels and metamodels' pairs based on their predictive probabilities (or confidence), various supported validation metrics, and their accuracy in predicting specific problematic data instances. The efficiency and effectiveness of MetaStackVis are demonstrated with an example based on a real healthcare dataset. The tool has also been evaluated through semi-structured interview sessions with Machine Learning and Visual Analytics experts. In addition to this thesis, we have written a short research paper explaining the design and implementation of MetaStackVis. However, this thesis provides further insights into the topic explored in the paper by offering additional findings and in-depth analysis. Thus, it can be considered a supplementary source of information for readers who are interested in diving deeper into the subject.
Published: 2023

74. Ensemble of 6 DoF Pose estimation from state-of-the-art deep methods.

Author: Ciencia de la computación e inteligencia artificial, Konputazio zientziak eta adimen artifiziala, Merino Bermejo, Ibon, Azpiazu Lozano, Jon, Remazeilles, Anthony, and Sierra Araujo, Basilio
Abstract: Deep learning methods have revolutionized computer vision since the appearance of AlexNet in 2012. Nevertheless, 6 degrees of freedom pose estimation is still a difficult task to perform precisely. Therefore, we propose 2 ensemble techniques to refine poses from different deep learning 6DoF pose estimation models. The first technique, merge ensemble, combines the outputs of the base models geometrically. In the second, stacked generalization, a machine learning model is trained using the outputs of the base models and outputs the refined pose. The merge method improves the performance of the base models on LMO and YCB-V datasets and performs better on the pose estimation task than the stacking strategy.
Published: 2023

75. Meta-learning for Adaptive Image Segmentation

Author: Sellaouti, Aymen, Jaâfra, Yasmina, Hamouda, Atef, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Kobsa, Alfred, Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Campilho, Aurélio, editor, and Kamel, Mohamed, editor
Published: 2014
Full Text: View/download PDF

76. Ensemble of Multiple Kernel SVM Classifiers

Author: Wang, Xiaoguang, Liu, Xuan, Japkowicz, Nathalie, Matwin, Stan, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Kobsa, Alfred, editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Weikum, Gerhard, editor, Goebel, Randy, editor, Tanaka, Yuzuru, editor, Wahlster, Wolfgang, editor, Siekmann, Jörg, editor, Sokolova, Marina, editor, and van Beek, Peter, editor
Published: 2014
Full Text: View/download PDF

77. Diagnosis and Prediction of Large-for-Gestational-Age Fetus Using the Stacked Generalization Method.

Author: Akhtar, Faheem, Li, Jianqiang, Pei, Yan, Imran, Azhar, Rajput, Asif, Azeem, Muhammad, and Wang, Qing
Subjects: FEATURE selection, NATIONAL competency-based educational tests, FETUS, GENERALIZATION, DECISION trees, MACHINE learning, MULTIPLE imputation (Statistics), PERINATAL care
Abstract: An accurate and efficient Large-for-Gestational-Age (LGA) classification system is developed to classify a fetus as LGA or non-LGA, which has the potential to assist paediatricians and experts in establishing a state-of-the-art LGA prognosis process. The performance of the proposed scheme is validated by using LGA dataset collected from the National Pre-Pregnancy and Examination Program of China (2010–2013). A master feature vector is created to establish primarily data pre-processing, which includes a features' discretization process and the entertainment of missing values and data imbalance issues. A principal feature vector is formed using GridSearch-based Recursive Feature Elimination with Cross-Validation (RFECV) + Information Gain (IG) feature selection scheme followed by stacking to select, rank, and extract significant features from the LGA dataset. Based on the proposed scheme, different features subset are identified and provided to four different machine learning (ML) classifiers. The proposed GridSearch-based RFECV+IG feature selection scheme with stacking using SVM (linear kernel) best suits the said classification process followed by SVM (RBF kernel) and LR classifiers. The Decision Tree (DT) classifier is not suggested because of its low performance. The highest prediction precision, recall, accuracy, Area Under the Curve (AUC), specificity, and F1 scores of 0.92, 0.87, 0.92, 0.95, 0.95, and 0.89 are achieved with SVM (linear kernel) classifier using top ten principal features subset, which is, in fact higher than the baselines methods. Moreover, almost every classification scheme best performed with ten principal feature subsets. Therefore, the proposed scheme has the potential to establish an efficient LGA prognosis process using gestational parameters, which can assist paediatricians and experts to improve the health of a newborn using computer aided-diagnostic system. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

78. Protein-protein interaction prediction using a hybrid feature representation and a stacked generalization scheme.

Author: Chen, Kuan-Hsi, Wang, Tsai-Feng, and Hu, Yuh-Jyh
Subjects: *PROTEIN-protein interactions, *GENERALIZATION, *MACHINE learning, *GENE ontology, *LOGICAL prediction, *PROTEIN content of food
Abstract: Background: Although various machine learning-based predictors have been developed for estimating protein–protein interactions, their performances vary with dataset and species, and are affected by two primary aspects: choice of learning algorithm, and the representation of protein pairs. To improve the performance of predicting protein–protein interactions, we exploit the synergy of multiple learning algorithms, and utilize the expressiveness of different protein-pair features. Results: We developed a stacked generalization scheme that integrates five learning algorithms. We also designed three types of protein-pair features based on the physicochemical properties of amino acids, gene ontology annotations, and interaction network topologies. When tested on 19 published datasets collected from eight species, the proposed approach achieved a significantly higher or comparable overall performance, compared with seven competitive predictors. Conclusion: We introduced an ensemble learning approach for PPI prediction that integrated multiple learning algorithms and different protein-pair representations. The extensive comparisons with other state-of-the-art prediction tools demonstrated the feasibility and superiority of the proposed method. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

79. Using an innovative stacked ensemble algorithm for the accurate prediction of preterm birth.

Author: Ramalingam, Pari, Sandhya, Maheshwari, and Sankar, Sharmila
Subjects: *ALGORITHMS, *PREMATURE infants, *MATERNAL health services, *PREGNANT women
Abstract: Objective: A birth before the normal term of 38 weeks of gestation is called a preterm birth (PTB). It is one of the major reasons for neonatal death. The objective of this article was to predict PTB well in advance so that it was converted to a term birth. Material and Methods: This study uses the historical data of expectant mothers and an innovative stacked ensemble (SE) algorithm to predict PTB. The proposed algorithm stacks classifiers in multiple tiers. The accuracy of the classification is improved in every tier. Results: The experimental results from this study show that PTB can be predicted with more than 96% accuracy using innovative SE learning. Conclusion: The proposed approach helps physicians in Gynecology and Obstetrics departments to decide whether the expectant mother needs treatment. Treatment can be given to delay the birth only in patients for whom PTB is predicted, or in many cases to convert the PTB to a normal birth. This, in turn, can reduce the mortality of babies due to PTB. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

80. Stacked ensemble extreme learning machine coupled with Partial Least Squares-based weighting strategy for nonlinear multivariate calibration.

Author: Shan, Peng, Zhao, Yuhui, Wang, Qiaoyun, Sha, Xiaopeng, Lv, Xiaoyong, Peng, Silong, and Ying, Yao
Subjects: *PARTIAL least squares regression, *MACHINE learning, *RADIAL basis functions, *CALIBRATION, *LEAST squares, *ARTIFICIAL neural networks
Abstract: With its simple theory and strong implementation, extreme learning machine (ELM) has become a competitive single-hidden-layer feedforward network for nonlinear multivariate calibration in chemometrics. To improve the generalization and robustness of ELM further, stacked generalization is introduced into ELM to construct a modified ELM model called stacked ensemble ELM (SE-ELM). The SE-ELM is to create a set of sub-models by applying ELM repeatedly to different sub-regions of the spectra and then combine the predictions of those sub-models according to a weighting strategy. Three different weighting strategies are explored to implement the proposed SE-ELM: the Winner-takes-all (WTA) weighting strategy, the constraint non-negative least squares (CNNLS) weighting strategy and the partial least squares (PLS) weighting strategy. Furthermore, PLS is suggested to be selected as the optimal weighting method that can handle the multi-colinearity among the predictions yielded by all the sub-models. The experimental assessment of the three SE-ELM models with different weighting strategies is carried out on six real spectroscopic datasets and compared with ELM, back-propagation neural network (BPNN) and Radial basis function neural network (RBFNN), statistically tested by the Wilcoxon signed rank test. The obtained experimental results suggest that, in general, all the SE-ELM models are more robust and more accurate than traditional ELM. In particular, the proposed PLS-based weighting strategy is at least statistically not worse than, and frequently better than the other two weighting strategies, BPNN, and RBFNN.
Highlights: • Stacked generalization is introduced into ELM to construct a modified ELM model called stacked ensemble ELM (SE-ELM). • ELM sub-models are created in different spectral intervals and then the predictions are combined via a weighting strategy. • The three weighting strategies (WTA, CNNLS and PLS) are explored to implement the proposed SE-ELM. • The results show that the SE-ELM_pls is at least statistically not worse than, and often better than the other models. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

81. DroidFusion: A Novel Multilevel Classifier Fusion Approach for Android Malware Detection.

Author: Yerima, Suleiman Y. and Sezer, Sakir
Abstract: Android malware has continued to grow in volume and complexity posing significant threats to the security of mobile devices and the services they enable. This has prompted increasing interest in employing machine learning to improve Android malware detection. In this paper, we present a novel classifier fusion approach based on a multilevel architecture that enables effective combination of machine learning algorithms for improved accuracy. The framework (called DroidFusion), generates a model by training base classifiers at a lower level and then applies a set of ranking-based algorithms on their predictive accuracies at the higher level in order to derive a final classifier. The induced multilevel DroidFusion model can then be utilized as an improved accuracy predictor for Android malware detection. We present experimental results on four separate datasets to demonstrate the effectiveness of our proposed approach. Furthermore, we demonstrate that the DroidFusion method can also effectively enable the fusion of ensemble learning algorithms for improved accuracy. Finally, we show that the prediction accuracy of DroidFusion, despite only utilizing a computational approach in the higher level, can outperform stacked generalization, a well-known classifier fusion method that employs a meta-classifier approach in its higher level. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

82. Cost-sensitive stacking: an empirical evaluation, arxiv 2301.01748

Author: Lawrance, Natalie, Guerry, Marie-Anne, Petrides, George, and Business technology and Operations
Subjects: Cost-sensitive learning, classification, Ensemble learning, stacking, Stacked generalization, Blending
Abstract: Many real-world classification problems are cost-sensitive in nature, such that the misclassification costs vary between data instances. Cost-sensitive learning adapts classification algorithms to account for differences in misclassification costs. Stacking is an ensemble method that uses predictions from several classifiers as the training data for another classifier, which in turn makes the final classification decision. While a large body of empirical work exists where stacking is applied in various domains, very few of these works take the misclassification costs into account. In fact, there is no consensus in the literature as to what cost-sensitive stacking is. In this paper we perform extensive experiments with the aim of establishing what the appropriate setup for a cost-sensitive stacking ensemble is. Our experiments, conducted on twelve datasets from a number of application domains, using real, instance-dependent misclassification costs, show that for best performance, both levels of stacking require cost-sensitive classification decision.
Published: 2023

83. Discrimination of Gentiana and Its Related Species Using IR Spectroscopy Combined with Feature Selection and Stacked Generalization

Author: Tao Shen, Hong Yu, and Yuan-Zhong Wang
Subjects: nir, ft-mir, species identification, gentiana, chemometrics, feature selection, stacked generalization, Organic chemistry, QD241-441
Abstract: Gentiana is one of the largest genera of Gentianoideae; most of its species have potential pharmaceutical value and are applied in local traditional medical treatment. Because of the phytochemical diversity and the differences in bioactive compounds among species, it is crucial to accurately identify authentic Gentiana species. In this paper, the feasibility of using the infrared spectroscopy technique combined with chemometrics analysis to identify Gentiana and its related species was studied. A total of 180 batches of raw spectral fingerprints were obtained from 18 species of Gentiana and Tripterospermum by near-infrared (NIR: 10,000−4000 cm⁻¹) and Fourier transform mid-infrared (MIR: 4000−600 cm⁻¹) spectrum. Firstly, principal component analysis (PCA) was utilized to explore the natural grouping of the 180 samples. Secondly, random forests (RF), support vector machine (SVM), and K-nearest neighbors (KNN) models were built while using full spectra (including 1487 NIR variables and 1214 FT-MIR variables, respectively). The MIR-SVM model had a higher classification accuracy rate than the other models that were based on the results of the calibration sets and prediction sets. The five feature selection strategies, VIP (variable importance in the projection), Boruta, GARF (genetic algorithm combined with random forest), GASVM (genetic algorithm combined with support vector machine), and Venn diagram calculation, were used to reduce the dimensions of the data variable in order to further reduce numbers of variables for modeling. Finally, 101 NIR and 73 FT-MIR bands were selected as the feature variables, respectively. Thirdly, stacking models were built based on the optimal spectral dataset. Most of the stacking models performed better than the full spectra-based models. RF and SVM (as base learners), combined with the SVM meta-classifier, was the optimal stacked generalization strategy. 
For the SG-Ven-MIR-SVM model, the accuracy (ACC) of the calibration set and validation set were both 100%. Sensitivity (SE), specificity (SP), efficiency (EFF), Matthews correlation coefficient (MCC), and Cohen’s kappa coefficient (K) were all 1, which showed that the model had the optimal authenticity identification performance. Those parameters indicated that stacked generalization combined with feature selection is probably an important technique for improving the predictive accuracy of classification models and avoiding overfitting. The study result can provide a valuable reference for the safety and effectiveness of the clinical application of medicinal Gentiana.
Published: 2020
Full Text: View/download PDF

84. Precipitation Nowcasting with Orographic Enhanced Stacked Generalization: Improving Deep Learning Predictions on Extreme Events

Author: Gabriele Franch, Daniele Nerini, Marta Pendesini, Luca Coviello, Giuseppe Jurman, and Cesare Furlanello
Subjects: rainfall, nowcasting, deep learning, stacked generalization, convolutional recurrent neural networks, data augmentation, conditional bias, ensemble forecasting, Meteorology. Climatology, QC851-999
Abstract: One of the most crucial applications of radar-based precipitation nowcasting systems is the short-term forecast of extreme rainfall events such as flash floods and severe thunderstorms. While deep learning nowcasting models have recently shown to provide better overall skill than traditional echo extrapolation models, they suffer from conditional bias, sometimes reporting lower skill on extreme rain rates compared to Lagrangian persistence, due to excessive prediction smoothing. This work presents a novel method to improve deep learning prediction skills in particular for extreme rainfall regimes. The solution is based on model stacking, where a convolutional neural network is trained to combine an ensemble of deep learning models with orographic features, doubling the prediction skills with respect to the ensemble members and their average on extreme rain rates, and outperforming them on all rain regimes. The proposed architecture was applied on the recently released TAASRAD19 radar dataset: the initial ensemble was built by training four models with the same TrajGRU architecture over different rainfall thresholds on the first six years of the dataset, while the following three years of data were used for the stacked model. The stacked model can reach the same skill of Lagrangian persistence on extreme rain rates while retaining superior performance on lower rain regimes.
Published: 2020
Full Text: View/download PDF

85. Ensemble Learning for Sentiment Classification

Author: Su, Ying, Zhang, Yong, Ji, Donghong, Wang, Yibing, Wu, Hongmiao, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Goebel, Randy, editor, Siekmann, Jörg, editor, Wahlster, Wolfgang, editor, Ji, Donghong, editor, and Xiao, Guozheng, editor
Published: 2013
Full Text: View/download PDF

86. Comparing ARIMA and computational intelligence methods to forecast daily hospital admissions due to circulatory and respiratory causes in Madrid.

Author: Navares, Ricardo, Díaz, Julio, Linares, Cristina, and Aznarte, José L.
Subjects: *TIME series analysis, *REGRESSION analysis, *PREDICTION models, *ALGORITHMS, *PUBLIC health
Abstract: Anticipating future workloads in a hospital may be of capital importance in order to distribute resources and improve patient attention. In this paper, we tackle the problem of predicting daily hospital admissions in Madrid due to circulatory and respiratory cases based on biometeorological indicators. A range of forecasting algorithms were proposed covering four model families: ensemble methods, boosting methods, artificial neural networks and ARIMA. Experiments show how the last two obtain better results on average, demonstrating that the problem can be properly solved with both approaches. Furthermore, a recently proposed technique known as stacked generalization was also used to dynamically combine the predictions from the four models, finally improving the performance with respect to the individual models. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

87. Deep Takagi–Sugeno–Kang Fuzzy Classifier With Shared Linguistic Fuzzy Rules.

Author: Zhang, Yuanpeng, Ishibuchi, Hisao, and Wang, Shitong
Subjects: FUZZY sets, MATHEMATICAL optimization, MACHINE learning
Abstract: In many practical applications of classifiers, not only high accuracy but also high interpretability is required. Among a wide variety of existing classifiers, Takagi–Sugeno–Kang (TSK) fuzzy classifiers may be one of the best choices for achieving a good balance between interpretability and accuracy. In order to further improve their accuracy without losing their interpretability, we propose a highly interpretable deep TSK fuzzy classifier HID-TSK-FC (deep shared-linguistic-rule-based TSK fuzzy classifier) based on the concept of shared linguistic fuzzy rules. The proposed classifier has two characteristics: One is a stacked hierarchical structure of component TSK fuzzy classifiers for high accuracy, and the other is the use of interpretable linguistic rules with the same set of linguistic labels for all inputs. High interpretability is achieved at each layer by using the same set of linguistic values for all inputs, including the outputs from the previous layers in the stacked hierarchical structure. We show that a linguistic rule with the outputs from the previous layers as its inputs is equivalent to a fuzzy rule with a nonlinear consequent or a linear consequent with a certainty factor. We also show that HID-TSK-FC is mathematically equivalent to a novel TSK fuzzy classifier with shared interpretable linguistic fuzzy rules. Promising performance of HID-TSK-FC is demonstrated through extensive computational experiments on benchmark datasets and a real-world application case. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

88. Efficient 2D and 3D Facade Segmentation Using Auto-Context.

Author: Gadde, Raghudeep, Jampani, Varun, Marlet, Renaud, and Gehler, Peter V.
Subjects: *IMAGE segmentation, *FEATURE extraction, *THREE-dimensional imaging, *DECISION trees, *MACHINE learning
Abstract: This paper introduces a fast and efficient segmentation technique for 2D images and 3D point clouds of building facades. Facades of buildings are highly structured and consequently most methods that have been proposed for this problem aim to make use of this strong prior information. Contrary to most prior work, we are describing a system that is almost domain independent and consists of standard segmentation methods. We train a sequence of boosted decision trees using auto-context features. This is learned using stacked generalization. We find that this technique performs better than, or comparably with, all previously published methods and present empirical results on all available 2D and 3D facade benchmark datasets. The proposed method is simple to implement, easy to extend, and very efficient at test-time inference. [ABSTRACT FROM PUBLISHER]
Published: 2018
Full Text: View/download PDF

89. Stacked generalization: an introduction to super learning.

Author: Naimi, Ashley I. and Balzer, Laura B.
Subjects: PREDICTION models, RECEIVER operating characteristic curves, MACHINE learning, EPIDEMIOLOGISTS, MEAN square algorithms
Abstract: Stacked generalization is an ensemble method that allows researchers to combine several different prediction algorithms into one. Since its introduction in the early 1990s, the method has evolved several times into a host of methods among which is the “Super Learner”. Super Learner uses V-fold cross-validation to build the optimal weighted combination of predictions from a library of candidate algorithms. Optimality is defined by a user-specified objective function, such as minimizing mean squared error or maximizing the area under the receiver operating characteristic curve. Although relatively simple in nature, use of Super Learner by epidemiologists has been hampered by limitations in understanding conceptual and technical details. We work step-by-step through two examples to illustrate concepts and address common concerns. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF
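
Note on record 89: the abstract describes Super Learner as using V-fold cross-validation to build a weighted combination of predictions from a library of candidate algorithms. The Python sketch below illustrates that idea under simplifying assumptions and is not the authors' implementation: the library holds only two toy regressors, the objective is mean squared error, and the convex weight is found by a coarse grid search. All function names (`fit_mean`, `fit_line`, `out_of_fold_predictions`, `best_convex_weight`, `super_learner`) and the synthetic data are hypothetical.

```python
import random


def fit_mean(xs, ys):
    """Candidate 1: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Candidate 2: ordinary least squares on a single feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return lambda x: mean_y + slope * (x - mean_x)


def out_of_fold_predictions(fitters, xs, ys, v=5):
    """V-fold CV: each row is predicted by models that never saw it."""
    n = len(xs)
    oof = [[0.0] * len(fitters) for _ in range(n)]
    for fold in range(v):
        train = [i for i in range(n) if i % v != fold]
        held_out = [i for i in range(n) if i % v == fold]
        models = [fit([xs[i] for i in train], [ys[i] for i in train])
                  for fit in fitters]
        for i in held_out:
            for j, model in enumerate(models):
                oof[i][j] = model(xs[i])
    return oof


def best_convex_weight(oof, ys, steps=100):
    """Grid-search w in [0, 1] minimising the cross-validated MSE of
    w * candidate_1 + (1 - w) * candidate_2."""
    best_w, best_mse = 0.0, float("inf")
    for k in range(steps + 1):
        w = k / steps
        mse = sum((w * p[0] + (1 - w) * p[1] - y) ** 2
                  for p, y in zip(oof, ys)) / len(ys)
        if mse < best_mse:
            best_w, best_mse = w, mse
    return best_w, best_mse


def super_learner(fitters, xs, ys, v=5):
    """Learn the weight on out-of-fold predictions, then refit both
    candidates on all data and return the weighted predictor."""
    oof = out_of_fold_predictions(fitters, xs, ys, v)
    w, cv_mse = best_convex_weight(oof, ys)
    models = [fit(xs, ys) for fit in fitters]

    def predict(x):
        return w * models[0](x) + (1 - w) * models[1](x)

    return predict, w, cv_mse


# Synthetic, illustrative data: y = 2x + 1 plus small Gaussian noise.
rng = random.Random(0)
xs = [i / 10 for i in range(50)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.1) for x in xs]

predict, weight, cv_mse = super_learner([fit_mean, fit_line], xs, ys)
print("weight on mean predictor:", weight)
print("cross-validated MSE:", cv_mse)
```

Because the grid includes w = 0 and w = 1, the combined cross-validated error in this sketch can never exceed that of the better single candidate, which is the basic guarantee that motivates learning the weights on out-of-fold predictions rather than on the training fit.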

90. Multi-view stacking for activity recognition with sound and accelerometer data.

Author: Garcia-Ceja, Enrique, Galván-Tejada, Carlos E., and Brena, Ramon
Subjects: *HUMAN activity recognition, *AMBIENT intelligence, *WEARABLE technology, *HETEROGENEOUS computing, *ACCELEROMETERS
Abstract: Many Ambient Intelligence (AmI) systems rely on automatic human activity recognition for getting crucial context information, so that they can provide personalized services based on the current users’ state. Activity recognition provides core functionality to many types of systems including: Ambient Assisted Living, fitness trackers, behavior monitoring, security, and so on. The advent of wearable devices along with their diverse set of embedded sensors opens new opportunities for ubiquitous context sensing. Recently, wearable devices such as smartphones and smart-watches have been used for activity recognition and monitoring. Most of the previous works use inertial sensors (accelerometers, gyroscopes) for activity recognition and combine them using an aggregation approach, i.e., extract features from each sensor and aggregate them to build the final classification model. This is not optimal since each sensor data source has its own statistical properties. In this work, we propose the use of a multi-view stacking method to fuse the data from heterogeneous types of sensors for activity recognition. Specifically, we used sound and accelerometer data collected with a smartphone and a wrist-band while performing home task activities. The proposed method is based on multi-view learning and stacked generalization, and consists of training a model for each of the sensor views and combining them with stacking. Our experimental results showed that the multi-view stacking method outperformed the aggregation approach in terms of accuracy, recall and specificity. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

91. Automatic classification of colorectal and prostatic histologic tumor images using multiscale multispectral local binary pattern texture features and stacked generalization.

Author: Peyret, Rémy, Bouridane, Ahmed, Khelifi, Fouad, Tahir, Muhammad Atif, and Al-Maadeed, Somaya
Subjects: *COLON cancer diagnosis, *PROSTATE tumors, *FEATURE extraction, *GENERALIZATION, *PROSTATE biopsy, *DIAGNOSIS
Abstract: This paper proposes a new multispectral multiscale local binary pattern feature extraction method for automatic classification of colorectal and prostatic tumor biopsies samples. A multilevel stacked generalization classification technique is also proposed and the key idea of the paper considers a grade diagnostic problem rather than a simple malignant versus tumorous tissue problem using the concept of multispectral imagery in both the visible and near infrared spectra. To validate the proposed algorithm performances, a comparative study against related works using multispectral imagery is conducted including an evaluation on three different multiclass datasets of multispectral histology images: two representing images of colorectal biopsies - one dataset was acquired in the visible spectrum while the second captures near-infrared spectra. The proposed algorithm achieves an accuracy of 99.6% on the different datasets. The results obtained demonstrate the advantages of infrared wavelengths to capture more efficiently the most discriminative information. The results obtained show that our proposed algorithm outperforms other similar methods. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

92. Ensemble of 6 DoF Pose estimation from state-of-the-art deep methods

Author: Merino Bermejo, Ibon, Azpiazu Lozano, Jon, Remazeilles, Anthony, and Sierra Araujo, Basilio
Subjects: stacked generalization, Artificial Intelligence, Cognitive Neuroscience, ensemble, deep learning, pose estimation, Computer Science Applications
Abstract: Deep learning methods have revolutionized computer vision since the appearance of AlexNet in 2012. Nevertheless, 6 degrees of freedom pose estimation is still a difficult task to perform precisely. Therefore, we propose 2 ensemble techniques to refine poses from different deep learning 6DoF pose estimation models. The first technique, merge ensemble, combines the outputs of the base models geometrically. In the second, stacked generalization, a machine learning model is trained using the outputs of the base models and outputs the refined pose. The merge method improves the performance of the base models on LMO and YCB-V datasets and performs better on the pose estimation task than the stacking strategy. This paper has been supported by the project PROFLOW under the Basque program ELKARTEK, grant agreement No. KK-2022/00024.
Published: 2023

93. A stacked generalization system for automated FOREX portfolio trading.

Author: Petropoulos, Anastasios, Chatzis, Sotirios P., Siakoulis, Vasilis, and Vlachogiannakis, Nikos
Subjects: *PORTFOLIO management (Investments), *FOREIGN exchange, *INTERNATIONAL finance, *MACHINE learning, *ALGORITHMS
Abstract: Multiple FOREX time series forecasting is a hot research topic in the literature of portfolio trading. To this end, a large variety of machine learning algorithms have been examined. However, it is now widely understood that, in real-world trading settings, no single machine learning model can consistently outperform the alternatives. In this work, we examine the efficacy and the feasibility of developing a stacked generalization system, intelligently combining the predictions of diverse machine learning models. Our approach establishes a novel inferential framework that comprises the following levels of data processing: (i) We model the dependence patterns between major currency pairs via a diverse set of commonly used machine learning algorithms, namely support vector machines (SVMs), random forests (RFs), Bayesian autoregressive trees (BART), dense-layer neural networks (NNs), and naïve Bayes (NB) classifiers. (ii) We generate implied signals of exchange rate fluctuation, based on the output of these models, as well as appropriate side information obtained by analyzing the correlations across currency pairs in our training datasets. (iii) We finally combine these implied signals into an aggregate predictive waveform, by leveraging majority voting, genetic algorithm optimization, and regression weighting techniques. We thoroughly test our framework in real-world trading scenarios; we show that our system leads to significantly better trading performance than the considered benchmarks. Thus, it represents an attractive solution for financial firms and corporations that perform foreign exchange portfolio management and daily trading. Our system can be used as an integrated part in international commercial trade activities or in a quantitative investing framework for algorithmic trading and carry-trade speculation. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

94. Development and validation of various phenotyping algorithms for Diabetes Mellitus using data from electronic health records.

Author: Esteban, Santiago, Rodríguez Tablado, Manuel, Peper, Francisco E., Mahumud, Yamila S., Ricci, Ricardo I., Kopitowski, Karin S., and Terrasa, Sergio A.
Subjects: *DIABETES, *PHENOTYPES, *ELECTRONIC health records, *INDIVIDUALIZED medicine, *STATISTICAL learning
Abstract: Background and Objective: Recent progression towards precision medicine has encouraged the use of electronic health records (EHRs) as a source for large amounts of data, which is required for studying the effect of treatments or risk factors in more specific subpopulations. Phenotyping algorithms allow patients to be classified automatically according to their particular electronic phenotype, thus facilitating the setup of retrospective cohorts. Our objective is to compare the performance of different classification strategies (only using standardized problems, rule-based algorithms, statistical learning algorithms (six learners) and stacked generalization (five versions)) for the categorization of patients according to their diabetic status (diabetics, not diabetics and inconclusive; diabetes of any type) using information extracted from EHRs. Methods: Patient information was extracted from the EHR at Hospital Italiano de Buenos Aires, Buenos Aires, Argentina. For the derivation and validation datasets, two probabilistic samples of patients from different years (2005: n = 1663; 2015: n = 800) were extracted. The only inclusion criterion was age (≥40 & <80 years). Four researchers manually reviewed all records and classified patients according to their diabetic status (diabetic: diabetes registered as a health problem or fulfilling the ADA criteria; non-diabetic: not fulfilling the ADA criteria and having at least one fasting glycemia below 126 mg/dL; inconclusive: no data regarding their diabetic status or only one abnormal value). The best performing algorithms within each strategy were tested on the validation set. Results: The standardized codes algorithm achieved a Kappa coefficient value of 0.59 (95% CI 0.49, 0.59) in the validation set. The Boolean logic algorithm reached 0.82 (95% CI 0.76, 0.88). A slightly higher value was achieved by the Feedforward Neural Network (0.9, 95% CI 0.85, 0.94). The best performing learner was the stacked generalization meta-learner that reached a Kappa coefficient value of 0.95 (95% CI 0.91, 0.98). Conclusions: The stacked generalization strategy and the feedforward neural network showed the best classification metrics in the validation set. The implementation of these algorithms enables the exploitation of the data of thousands of patients accurately. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF
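
Note on the Kappa values in the record above: the Kappa coefficient measures agreement beyond chance between an algorithm's labels and the manual review. The following is a minimal sketch only, assuming the unweighted Cohen's kappa (the abstract does not state which variant was used). The function name `cohen_kappa` and the ten example labels are hypothetical and are not study data.

```python
from collections import Counter

def cohen_kappa(reference, predicted):
    """Unweighted Cohen's kappa between two label sequences of equal length."""
    n = len(reference)
    # Observed agreement: fraction of records where both labels match.
    observed = sum(r == p for r, p in zip(reference, predicted)) / n
    # Chance agreement: sum over classes of the product of marginal frequencies.
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    expected = sum(ref_counts[c] * pred_counts[c] for c in ref_counts) / n ** 2
    return (observed - expected) / (1 - expected)

# Made-up labels: D = diabetic, N = non-diabetic, I = inconclusive.
manual_review = ["D", "D", "D", "N", "N", "N", "N", "I", "I", "I"]
algorithm = ["D", "D", "N", "N", "N", "N", "N", "I", "I", "D"]

kappa = cohen_kappa(manual_review, algorithm)
print(f"kappa = {kappa:.3f}")
```

Here 8 of 10 labels agree (observed agreement 0.80) and chance agreement is 0.35, so kappa is 0.45 / 0.65. The confidence intervals quoted in the abstract would need an additional step, such as a bootstrap, which this sketch does not attempt.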

95. StackNet-DenVIS: a multi-layer perceptron stacked ensembling approach for COVID-19 detection using X-ray images

Author: Autee, Pratik, Bagwe, Sagar, Shah, Vimal, and Srivastava, Kriti
Published: 2020
Full Text: View/download PDF

96. On the Performance of Stacked Generalization Classifiers

Author: Ozay, Mete, Vural, Fatos Tunay Yarman, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Campilho, Aurélio, editor, and Kamel, Mohamed, editor
Published: 2008
Full Text: View/download PDF

97. Evaluation of stacked ensemble model performance to predict clinical outcomes: A COVID-19 study.

Author: Kablan, Rianne, Miller, Hunter A., Suliman, Sally, and Frieboes, Hermann B.
Published: 2023
Full Text: View/download PDF

98. Predicting the potential distribution of wheatear birds using stacked generalization-based ensembles.

Author: El Alaoui, Omar and Idri, Ali
Subjects: MACHINE learning, SPECIES distribution, NUMBERS of species, CLIMATE change, K-nearest neighbor classification, MULTILAYER perceptrons, BOOSTING algorithms
Abstract: Habitat suitability models, usually referred to as species distribution models (SDMs), are widely applied in ecology for many purposes, including species conservation, habitat discovery, and gaining evolutionary insights by estimating the distribution of species. Machine learning algorithms as well as statistical models have been recently used to predict the distribution of species. However, they seemed to have some limitations due to the data and the models used. Therefore, this study proposes a novel approach for assessing habitat suitability based on ensemble learning techniques. Three heterogeneous ensembles were built using the stacked generalization method to model the distribution of four wheatear species (Oenanthe deserti, Oenanthe leucopyga, Oenanthe leucura, and Oenanthe oenanthe) located in Morocco. Initially, a set of base learners was constructed by training six machine learning algorithms on each species' dataset (Multi-Layer Perceptron (MLP), Support Vector Classifier (SVC), K-nearest neighbors (KNN), Decision Trees (DT), Gradient Boosting Classifier (GB), and Random Forest (RF)). Then, the predictions of these base learners were fed as training data to train three meta-learners (Logistic Regression (LR), SVC, and MLP). To evaluate and assess the performance of the proposed approaches, we used: (1) six performance criteria (accuracy, recall, precision, F1-score, AUC, and TSS), (2) the Borda Count (BC) ranking method based on multiple criteria to rank the best-performing models, and (3) the Scott Knott (SK) test to statistically compare the performance of the presented models. The results based on the six evaluation metrics showed that stacked ensembles outperformed their single-model counterparts in all species datasets, and the stacked model with SVC as a meta-learner outperformed the other two ensembles. The results showed the potential of using ensemble learning techniques to model species distribution and recommend the use of the stacked generalization technique as a combination strategy since it gave better results compared to single models in four wheatear species datasets. Moreover, to assess the impact of future climate changes on the distribution of the four wheatear species, the best-performing distribution model was selected and projected into the current and future climatic conditions. The distributions of the Moroccan wheatear birds were found to be slightly affected by future climate changes. • Evaluate 6 ML models in predicting the distribution of 4 wheatear birds in Morocco. • Proposes 3 stacked generalization ensembles using 6 base learners & 3 meta-models. • Evaluate whether stacked ensembles perform better than single models. • Analyzes impact of future climate changes on distribution of 4 wheatear bird species. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF
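
Note on the method in the record above: the abstract describes training base learners and then feeding their predictions to a meta-learner. The following is a minimal pure-Python sketch of that two-level idea, not the authors' implementation. Everything in it is an assumption made for illustration: the toy data, the three simple base learners (two one-feature stumps and a nearest-centroid rule) standing in for the six algorithms named in the abstract, the lookup-table meta-learner standing in for LR, SVC, or MLP, and the use of 5-fold out-of-fold predictions to build the meta-learner's training set.

```python
import random

def make_data(n, seed):
    """Toy binary data (hypothetical): class 1 is shifted by +2 on both features."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(2.0 * label, 1.0)])
        y.append(label)
    return X, y

def class_means(X, y):
    """Mean feature vector of class 0 and of class 1."""
    means = []
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        means.append([sum(col) / len(rows) for col in zip(*rows)])
    return means

class Stump:
    """Base learner: thresholds one feature midway between the class means."""
    def __init__(self, feature):
        self.feature = feature
    def fit(self, X, y):
        m0, m1 = class_means(X, y)
        self.lo, self.hi = m0[self.feature], m1[self.feature]
        return self
    def predict(self, X):
        mid, up = (self.lo + self.hi) / 2, self.hi >= self.lo
        return [int((x[self.feature] > mid) == up) for x in X]

class NearestCentroid:
    """Base learner: assigns the class whose mean vector is closer."""
    def fit(self, X, y):
        self.means = class_means(X, y)
        return self
    def predict(self, X):
        def d2(a, b):
            return sum((u - v) ** 2 for u, v in zip(a, b))
        return [int(d2(x, self.means[1]) < d2(x, self.means[0])) for x in X]

class TableMeta:
    """Meta-learner: majority class for each combination of base predictions."""
    def fit(self, Z, y):
        counts = {}
        for z, label in zip(Z, y):
            counts.setdefault(z, [0, 0])[label] += 1
        self.table = {z: int(c[1] > c[0]) for z, c in counts.items()}
        self.default = int(2 * sum(y) > len(y))  # fallback for unseen combinations
        return self
    def predict(self, Z):
        return [self.table.get(z, self.default) for z in Z]

def new_base_learners():
    return [Stump(0), Stump(1), NearestCentroid()]

def base_features(learners, X):
    """One tuple of base-learner predictions per sample."""
    return list(zip(*(m.predict(X) for m in learners)))

def fit_stack(X, y, k=5):
    # Level 0: out-of-fold predictions, so the meta-learner never sees
    # predictions that a base learner made on its own training rows.
    Z = [None] * len(X)
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        held = [i for i in range(len(X)) if i % k == fold]
        learners = [m.fit([X[i] for i in train], [y[i] for i in train])
                    for m in new_base_learners()]
        for i, z in zip(held, base_features(learners, [X[i] for i in held])):
            Z[i] = z
    # Level 1: train the meta-learner, then refit base learners on all data.
    meta = TableMeta().fit(Z, y)
    return [m.fit(X, y) for m in new_base_learners()], meta

def predict_stack(model, X):
    learners, meta = model
    return meta.predict(base_features(learners, X))

X_train, y_train = make_data(200, seed=0)
X_test, y_test = make_data(200, seed=1)
pred = predict_stack(fit_stack(X_train, y_train), X_test)
accuracy = sum(p == t for p, t in zip(pred, y_test)) / len(y_test)
print(f"stacked accuracy on toy test data: {accuracy:.2f}")
```

Whether the study used out-of-fold predictions is not stated in the abstract; they are used here because training a meta-learner on in-sample base predictions tends to overstate how reliable the base learners are.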

99. Change-Point Detection in Business Cycles using Machine Learning Algorithms

Author: Pérez Quirós, Gabriel, Carnero, M. Angeles, Universidad de Alicante. Departamento de Fundamentos del Análisis Económico, and Masoudnia, Mohammadreza
Abstract: Turning points in business cycles are defined as the onset of a recession or an expansion, and they are quite difficult to predict. In this thesis, we approach the problem of turning (change) point detection from the viewpoint of a binary classification task. Due to the small ratio of changes to total data (as the number of recessions is relatively low), we face a heavy class-imbalance challenge in this problem. We explore a wide variety of machine learning-based solutions for this problem: from a base classifier to a multi-step classifier ensemble algorithm, as well as a feature selection step. We examined the proposed classification methods on a large Canadian dataset. Among the examined methods, the hybrid ensemble method, which includes data sampling followed by feature selection and a multi-step ensemble, can predict the Covid-19 recession's change points precisely using all the time series available one month earlier. Some robustness checks, such as the effect of window size on model performance, are also provided. Moreover, when the financial crisis is excluded from the training set, the method is still able to predict the change points of the financial crisis precisely; however, the Covid-19 recession change points were detected one period late, suggesting the importance of the financial crisis data in detecting Covid-19 change points.
Published: 2022

100. Evaluating StackGenVis with a Comparative User Study

Author: Chatzimparmpas, Angelos, Park, Vilhelm, and Kerren, Andreas
Abstract: Stacked generalization (also called stacking) is an ensemble method in machine learning that deploys a metamodel to summarize the predictive results of heterogeneous base models organized into one or more layers. Despite being capable of producing high-performance results, building a stack of models can be a trial-and-error procedure. Thus, our previously developed visual analytics system, entitled StackGenVis, was designed to monitor and control the entire stacking process visually. In this work, we present the results of a comparative user study we performed for evaluating the StackGenVis system. We divided the study participants into two groups to test the usability and effectiveness of StackGenVis compared to Orange Visual Stacking (OVS) in an exploratory usage scenario using healthcare data. The results indicate that StackGenVis is significantly more powerful than OVS based on the qualitative feedback provided by the participants. However, the average completion time for all tasks was comparable between both tools.
Published: 2022
Full Text: View/download PDF
