10 results
Search Results
2. Hybrid Reptile Search Algorithm and Remora Optimization Algorithm for Optimization Tasks and Data Clustering.
- Author
-
Almotairi, Khaled H. and Abualigah, Laith
- Subjects
SEARCH algorithms, MATHEMATICAL optimization, REPTILES, MACHINE learning, DATA mining
- Abstract
Data clustering is a complex data mining problem that groups a massive number of data objects into a predefined number of clusters; in other words, it finds symmetric and asymmetric objects. Various optimization methods have been used to solve different machine learning problems, but they usually suffer from local optima and an imbalance between their search mechanisms. This paper proposes a novel hybrid optimization method, called HRSA, which combines the original Reptile Search Algorithm (RSA) and the Remora Optimization Algorithm (ROA) and coordinates their search processes through a novel transition method. HRSA aims to avoid the main weaknesses of the original methods and to find better solutions. It is tested on various complicated optimization problems: twenty-three benchmark test functions and eight data clustering problems. The obtained results illustrate that HRSA performs significantly better than the original and comparative state-of-the-art methods; it outperformed all comparative methods on the mathematical problems and obtained promising results on the clustering problems. Thus, HRSA shows remarkable efficacy when employed for various clustering problems. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
3. A new method of mining data streams using harmony search.
- Author
-
Karimi, Zohre, Abolhassani, Hassan, and Beigy, Hamid
- Subjects
MACHINE learning, ALGORITHMS, DATA mining, MATHEMATICAL optimization, INFORMATION retrieval
- Abstract
Incremental learning has been used extensively for data stream classification, yet most attention has been paid to non-evolutionary methods. In this paper, we introduce new incremental learning algorithms based on harmony search. We first propose a new algorithm for the classification of batch data, called the harmony-based classifier, and then give its incremental version for the classification of data streams, called the incremental harmony-based classifier. Finally, we improve it to reduce its computational overhead in the absence of drift and to increase its robustness in the presence of noise; this version is called the improved incremental harmony-based classifier. The proposed methods are evaluated on several real-world and synthetic data sets. Experimental results show that the proposed batch classifier outperforms some existing batch classifiers, and that the proposed incremental methods can effectively address the issues usually encountered in data stream environments. The improved incremental harmony-based classifier captures concept drift with significantly better speed and accuracy than the non-incremental harmony-based method, and its accuracy is comparable to that of non-evolutionary algorithms. The experimental results also show the robustness of the improved incremental harmony-based classifier. [ABSTRACT FROM AUTHOR]
- Published
- 2012
- Full Text
- View/download PDF
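The harmony-based classifiers above build on the harmony search metaheuristic: a memory of candidate solutions is improvised upon using a memory-consideration rate (HMCR), a pitch-adjustment rate (PAR), and occasional random re-initialization. A minimal, generic sketch of that underlying search loop (not the authors' classifier; the parameter names and defaults here are illustrative) might look like:

```python
import random

def harmony_search(objective, dim, bounds, memory_size=10, hmcr=0.9, par=0.3,
                   bandwidth=0.05, iterations=2000, seed=0):
    """Minimize `objective` over `bounds` with a basic harmony search."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Harmony memory: a pool of candidate solutions with their scores.
    memory = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(memory_size)]
    scores = [objective(h) for h in memory]
    for _ in range(iterations):
        new = []
        for d in range(dim):
            if rng.random() < hmcr:            # consider the memory
                x = rng.choice(memory)[d]
                if rng.random() < par:         # pitch adjustment
                    x += rng.uniform(-bandwidth, bandwidth)
            else:                              # random improvisation
                x = rng.uniform(lo, hi)
            new.append(min(hi, max(lo, x)))
        s = objective(new)
        worst = max(range(memory_size), key=scores.__getitem__)
        if s < scores[worst]:                  # replace the worst harmony
            memory[worst], scores[worst] = new, s
    best = min(range(memory_size), key=scores.__getitem__)
    return memory[best], scores[best]
```

For example, minimizing the sphere function `sum(x*x for x in v)` over `(-5, 5)` drives the best harmony close to the origin within a few thousand improvisations.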
4. Maxi-Min Margin Machine: Learning Large Margin Classifiers Locally and Globally.
- Author
-
Kaizhu Huang, Haiqin Yang, King, Irwin, and Lyu, Michael R.
- Subjects
MACHINE learning, ARTIFICIAL intelligence, MACHINE theory, DATA mining, STATISTICAL correlation, MULTIVARIATE analysis, MATHEMATICAL optimization, COMPUTER programming, ALGORITHMS
- Abstract
In this paper, we propose a novel large margin classifier, called the maxi-min margin machine (M4). This model learns the decision boundary both locally and globally. In comparison, other large margin classifiers construct separating hyperplanes only either locally or globally. For example, a state-of-the-art large margin classifier, the support vector machine (SVM), considers data only locally, while another significant model, the minimax probability machine (MPM), builds the decision hyperplane exclusively on global information. As a major contribution, we show that SVM yields the same solution as M4 when data satisfy certain conditions, and that MPM can be regarded as a relaxation model of M4. Moreover, based on our proposed local and global view of data, another popular model, linear discriminant analysis, can easily be interpreted and extended as well. We describe the M4 model definition, provide a geometrical interpretation, present theoretical justifications, and propose a practical sequential conic programming method to solve the optimization problem. We also show how to exploit Mercer kernels to extend M4 for nonlinear classification. Furthermore, we perform a series of evaluations on both synthetic data sets and real-world benchmark data sets. Comparison with SVM and MPM demonstrates the advantages of our new model. [ABSTRACT FROM AUTHOR]
- Published
- 2008
- Full Text
- View/download PDF
5. Improved salp swarm algorithm based on the levy flight for feature selection.
- Author
-
Balakrishnan, K., Dhanalakshmi, R., and Khaire, Utkarsh Mahadeo
- Subjects
ALGORITHMS, FEATURE selection, DATA mining, MACHINE learning, DATA science, MATHEMATICAL optimization
- Abstract
The fields of data science and data mining face high-dimensionality issues because of the sheer volume of data, and conventional machine learning techniques yield unsatisfactory results on high-dimensional datasets. Feature selection extracts the relevant information from a dataset in order to reduce its dimensionality. The recently proposed Salp Swarm Algorithm (SSA) is a population-based meta-heuristic optimization technique inspired by the swarming behavior of sea salps. SSA can fail to converge from its initial random solutions to the global optimum owing to its complete dependence on the iteration count to balance exploration and exploitation. The proposed improved SSA (iSSA) aims to enhance the salps' ability to explore divergent areas by randomly updating their locations. Randomizing the salp locations via Levy flight enriches the exploitation potential of SSA, helping the model converge toward the global optimum. The performance of the proposed iSSA is investigated using six different high-dimensional microarray datasets. A comparison of convergence ability shows that the proposed model outperforms SSA, providing 0.1033% more confidence in the selected features. The simulation results reveal that iSSA delivers competitive and significant improvements over SSA. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
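The Levy-flight randomization described above is commonly implemented with the Mantegna algorithm, which builds a heavy-tailed step from two Gaussian draws. A minimal sketch of such a step and a salp-style position update toward the current best solution (illustrative only; the paper's exact update rule may differ):

```python
import math
import random

def levy_step(beta=1.5, rng=random):
    """Draw one Levy-distributed step via the Mantegna algorithm."""
    # Scale factor sigma_u from the Mantegna (1994) construction.
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0, sigma_u)
    v = rng.gauss(0, 1)
    return u / abs(v) ** (1 / beta)  # occasional large jumps (heavy tail)

def levy_update(position, best, step_scale=0.01, rng=random):
    """Perturb a salp's position toward the food source with Levy noise."""
    return [x + step_scale * levy_step(rng=rng) * (b - x)
            for x, b in zip(position, best)]
```

The heavy tail is the point: most steps are small (local exploitation), but rare large jumps let salps escape regions where the plain iteration-driven update would stagnate.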
6. Ant Colony Optimization Algorithm for Interpretable Bayesian Classifiers Combination: Application to Medical Predictions.
- Author
-
Bouktif, Salah, Hanna, Eileen Marie, Zaki, Nazar, and Khousa, Eman Abu
- Subjects
ANT algorithms, MACHINE learning, PREDICTION models, COMPUTER users, MEDICAL informatics, DATA mining, MEDICAL genetics
- Abstract
Prediction and classification techniques have been well studied by machine learning researchers and developed for several real-world problems. However, the level of acceptance and success of prediction models is still below expectations, owing to difficulties such as the low performance of prediction models when they are applied in different environments. This problem has been addressed by many researchers, mainly from the machine learning community. A second problem, principally raised by model users in other communities (managers, economists, engineers, biologists, medical practitioners, etc.), is prediction models' interpretability: the ability of a model to explain its predictions and exhibit the causal relationships between inputs and outputs. In classification, a successful way to alleviate low performance is to use ensemble classifiers, an intuitive strategy that activates collaboration between different classifiers to achieve better performance than any individual classifier. Unfortunately, ensemble classifier methods do not take the interpretability of the final classification outcome into account; they even worsen the original interpretability of the individual classifiers. In this paper we propose a novel implementation of the classifier combination approach that not only promotes overall performance but also preserves the interpretability of the resulting model. We propose a solution based on Ant Colony Optimization and tailored to the case of Bayesian classifiers. We validate the proposed solution with case studies from the medical domain, namely heart disease and cardiotocography-based predictions, problems where interpretability is critical to making appropriate clinical decisions. Availability: The datasets, prediction models, and software tool, together with supplementary materials, are available at http://faculty.uaeu.ac.ae/salahb/ACO4BC.htm. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
7. CCH-based geometric algorithms for SVM and applications.
- Author
-
Xin-jun Peng and Yi-fei Wang
- Subjects
MACHINE learning, DATA mining, ALGORITHMS, PATTERN perception, QUADRATIC programming, MATHEMATICAL optimization
- Abstract
The support vector machine (SVM) is a novel machine learning tool in data mining. In this paper, a geometric approach based on the compressed convex hull (CCH), with a mathematical framework, is introduced to solve SVM classification problems. Compared with the reduced convex hull (RCH), the CCH preserves the shape of the geometric solids of the data sets; meanwhile, it is easy to give the necessary and sufficient condition for determining its extreme points. As practical applications of the CCH, sparse and probabilistic speed-up geometric algorithms are developed. Results of numerical experiments show that the proposed algorithms reduce kernel calculations and perform well. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
8. Adaptive feature selection using v-shaped binary particle swarm optimization.
- Author
-
Teng, Xuyang, Dong, Hongbin, and Zhou, Xiurong
- Subjects
PARTICLE swarm optimization, DATA mining, ENTROPY, COMPUTER algorithms, MATHEMATICAL optimization
- Abstract
Feature selection is an important preprocessing method in machine learning and data mining. This process can be used not only to reduce the amount of data to be analyzed but also to build models with stronger interpretability based on fewer features. Traditional feature selection methods evaluate the dependency and redundancy of features separately, which leads to a lack of measurement of their combined effect. Moreover, a greedy search considers only the optimization of the current round and thus cannot be a global search. To evaluate the combined effect of different subsets in the entire feature space, an adaptive feature selection method based on V-shaped binary particle swarm optimization is proposed. In this method, the fitness function is constructed using the correlation information entropy. Feature subsets are regarded as individuals in a population, and the feature space is searched using V-shaped binary particle swarm optimization. The above procedure overcomes the hard constraint on the number of features, enables the combined evaluation of each subset as a whole, and improves the search ability of conventional binary particle swarm optimization. The proposed algorithm is an adaptive method with respect to the number of feature subsets. The experimental results show the advantages of optimizing the feature subsets using the V-shaped transfer function and confirm the effectiveness and efficiency of the feature subsets obtained under different classifiers. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
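A V-shaped transfer function maps a particle's continuous velocity to a probability of *flipping* the corresponding feature-selection bit, rather than setting the bit directly from the probability as S-shaped transfers do. A minimal sketch using |tanh(v)| as the transfer function (one common V-shaped choice; the paper's exact function may differ):

```python
import math
import random

def v_transfer(v):
    """V-shaped transfer: map a velocity to a flip probability in [0, 1)."""
    return abs(math.tanh(v))

def update_bits(bits, velocities, rng=random):
    """Flip each feature-selection bit with probability V(velocity).

    Flipping (instead of overwriting) preserves more of the current
    subset when velocities are small, which tends to improve the
    exploration behavior of binary PSO.
    """
    return [1 - b if rng.random() < v_transfer(v) else b
            for b, v in zip(bits, velocities)]
```

With zero velocity the subset is left untouched (`v_transfer(0) == 0`), while a large-magnitude velocity almost surely toggles the feature in or out of the subset.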
9. Activity preserving graph simplification.
- Author
-
Bonchi, Francesco, Francisci Morales, Gianmarco, Gionis, Aristides, and Ukkonen, Antti
- Subjects
GRAPHIC methods, ALGORITHMS, MATHEMATICAL optimization, COMBINATORICS, DATA mining, MACHINE learning
- Abstract
We study the problem of simplifying a given directed graph by keeping a small subset of its arcs. Our goal is to maintain the connectivity required to explain a set of observed traces of information propagation across the graph. Unlike previous work, we do not make any assumption about an underlying model of information propagation; instead, we approach the task as a combinatorial problem. We prove that the resulting optimization problem is NP-hard. We show that a standard greedy algorithm performs very well in practice, even though it does not have theoretical guarantees. Additionally, if the activity traces have a tree structure, we show that the objective function is supermodular, and we experimentally verify that the approach for size-constrained submodular minimization recently proposed by Nagano et al. (28th International Conference on Machine Learning) produces very good results. Moreover, when applied to the task of reconstructing an unobserved graph, our methods perform comparably to a state-of-the-art algorithm devised specifically for this task. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
10. A non-linear data mining parameter selection algorithm for continuous variables
- Author
-
Peyman Tavallali, Sean F. Brady, and Marianne Razavi
- Subjects
Computer science, Medicine, Cardiovascular medicine, Cardiovascular diseases, Machine learning, Data mining, Least squares, Regression analysis, Artificial neural networks, Mean squared error, Model selection, Selection algorithms, Multicollinearity, Mathematical optimization, Computational biology, Neuroscience
- Abstract
In this article, we propose a new data mining algorithm that can both capture the non-linearity in data and find the best subset model. To produce an enhanced subset of the original variables, a preferred selection method should add a supplementary level of regression analysis that captures complex relationships in the data via mathematical transformation of the predictors and exploration of the synergistic effects of combined variables. The method we present here can produce an optimal subset of variables, rendering the overall process of model selection more efficient. The algorithm introduces interpretable parameters by transforming the original inputs while maintaining a faithful fit to the data. The core objective of this paper is to introduce a new estimation technique for the classical least squares regression framework. This new automatic variable transformation and model selection method offers an optimal and stable model that minimizes the mean square error and variability, combining all-possible-subset selection with the inclusion of variable transformations and interactions. Moreover, the method controls multicollinearity, leading to an optimal set of explanatory variables.
- Published
- 2017
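The combination of variable transformation and least squares fitting described above can be illustrated, in heavily simplified form, by scoring a few candidate transforms of a single predictor with closed-form simple regression and keeping the one with the smallest MSE (this conveys only the flavor of the approach, not the paper's estimator; the transform set is illustrative):

```python
import math

def ols_mse(x, y):
    """Closed-form simple linear regression y ~ a + b*x; return the fitted MSE."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx else 0.0
    a = my - b * mx
    return sum((a + b * xi - yi) ** 2 for xi, yi in zip(x, y)) / n

def best_transform(x, y):
    """Pick the predictor transformation with the smallest least-squares MSE."""
    transforms = {
        "identity": lambda v: v,
        "square": lambda v: v * v,
        "log": lambda v: math.log(abs(v) + 1e-9),  # guard against log(0)
    }
    scored = {name: ols_mse([f(xi) for xi in x], y)
              for name, f in transforms.items()}
    return min(scored, key=scored.get)
```

A full version in the spirit of the paper would enumerate subsets of transformed predictors (and their interactions) and select the combination minimizing the mean square error.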