20 results on '"Garcìa Escudero, L"'
Search Results
2. Robust estimation of mixtures of regressions with random covariates, via trimming and constraints
- Author
- García-Escudero, L. A., Gordaliza, A., Greselin, F., Ingrassia, S., and Mayo-Iscar, A.
- Subjects
- Statistics - Methodology
- Abstract
A robust estimator for a wide family of mixtures of linear regression is presented. Robustness is based on the joint adoption of the Cluster Weighted Model and of an estimator based on trimming and restrictions. The selected model provides the conditional distribution of the response for each group, as in mixtures of regression, and further supplies local distributions for the explanatory variables. A novel version of the restrictions has been devised, under this model, for separately controlling the two sources of variability identified in it. This proposal avoids singularities in the log-likelihood, caused by approximate local collinearity in the explanatory variables or local exact fit in regressions, and reduces the occurrence of spurious local maximizers. In a natural way, due to the interaction between the model and the estimator, the procedure is able to resist the harmful influence of bad leverage points along the estimation of the mixture of regressions, which is still an open issue in the literature. The given methodology defines a well-posed statistical problem, whose estimator exists and is consistent to the corresponding solution of the population optimum, under widely general conditions. A feasible EM algorithm has also been provided to obtain the corresponding estimation. Many simulated examples and two real datasets have been chosen to show the ability of the procedure, on the one hand, to detect anomalous data, and, on the other hand, to identify the real cluster regressions without the influence of contamination.
- Published
- 2015
3. Grouping Around Different Dimensional Affine Subspaces
- Author
- García-Escudero, L. A., Gordaliza, A., Matrán, C., and Mayo-Iscar, A.; Giudici, Paolo, Ingrassia, Salvatore, and Vichi, Maurizio (editors)
- Published
- 2013
4. Exploring solutions via monitoring for cluster weighted robust models
- Author
- Porzio, GC, Rampichini, C, Bocci, C, Cappozzo, A, García-Escudero, L. A, Greselin, F, and Mayo-Iscar, A
- Abstract
Depending on the selected hyper-parameters, cluster weighted modeling may produce a set of diverse solutions. Particularly, the user can manually specify the number of mixture components, the degree of heteroscedasticity of the clusters in the explanatory variables and of the errors around the regression lines. In addition, when performing robust inference, the level of impartial trimming enforced in the estimation needs to be selected. This flexibility gives rise to a variety of “legitimate” solutions. To mitigate the problem of model selection, we propose a two stage monitoring procedure to identify a set of “good models”. An application to the benchmark tone perception data showcases the benefits of the approach.
- Published
- 2021
5. Robust Linear Clustering
- Author
- García-Escudero, L. A., Gordaliza, A., San Martín, R., Van Aelst, S., and Zamar, R.
- Published
- 2009
6. Exploring solutions via monitoring for cluster weighted robust models
- Author
- Cappozzo, A, García-Escudero, L. A, Greselin, F., Mayo-Iscar, A., Porzio, GC, Rampichini, C, and Bocci, C
- Subjects
- SECS-S/01 - STATISTICA, Cluster-weighted modeling, Outliers, Trimmed BIC, Eigenvalue constraint, Monitoring, Constrained estimation, Model-based clustering
- Abstract
Depending on the selected hyper-parameters, cluster weighted modeling may produce a set of diverse solutions. Particularly, the user can manually specify the number of mixture components, the degree of heteroscedasticity of the clusters in the explanatory variables and of the errors around the regression lines. In addition, when performing robust inference, the level of impartial trimming enforced in the estimation needs to be selected. This flexibility gives rise to a variety of “legitimate” solutions. To mitigate the problem of model selection, we propose a two stage monitoring procedure to identify a set of “good models”. An application to the benchmark tone perception data showcases the benefits of the approach.
- Published
- 2021
7. Robust estimation of mixtures of Skew Normal Distributions
- Author
- García-Escudero, L, Greselin, F, McLachlan, G, and Mayo-Iscar, A
- Subjects
- Clustering, Robustness, Trimming, Constrained estimation, Skew data, model-based classification, Finite mixture models, SECS-S/01 - STATISTICA
- Abstract
Recently, observed departures from the classical Gaussian mixture model in real datasets motivated the introduction of mixtures of skew t, and remarkably widened the application of model based clustering and classification to a great many real datasets. Unfortunately, when data contamination occurs, classical inference for these models could be severely affected. In this paper we introduce robust estimation of mixtures of skew normal, to resist sparse outliers and even pointwise contamination that may arise in data collection. Hence, in each component, the skewed nature of the data is explicitly modeled, while any departure from it is dealt with by the robust approach. Some applications on real data show the effectiveness of the proposal.
- Published
- 2016
8. Extending robust fuzzy clustering to skew data.
- Author
- Ana Colubi, Erricos J. Kontoghiorghes, and Herman K. Van Dijk, García-Escudero, LA, Greselin, F, and Mayo-Iscar, A
- Abstract
Clustering is an important technique in exploratory data analysis, with applications in image processing, object classification, target recognition, data mining etc. The aim is to partition data according to natural classes present in it, assigning data points that are more similar to the same cluster. We solved this ill-posed problem by adopting a fuzzy clustering method, based on mixtures of skew Gaussian, endowed by the joint usage of trimming and constrained estimation of scatter matrices. A set of membership values are used to fuzzy partition the data and to contribute to the robust estimates of the mixture parameters. The purpose is to adopt the basic skew Gaussian component for the mixture and apply impartial trimming to the data, to model the skew core of the clusters and to adapt to any type of tail behaviour. The choice of the skew Gaussian components is motivated by the fact that, with the increased availability of multivariate datasets, often underlying asymmetric structures appear. In these cases, the extremely useful paradigm for clustering given by the mixtures of Gaussian distributions appeared somehow unrealistic. Moreover, impartial trimming provides robust ML estimation, even in presence of outliers in the data. Finally, synthetic and real data are analyzed, to show how intermediate membership values are estimated for observations lying at cluster overlap, while cluster cores are composed by observations that are assigned to a cluster in a crisp way.
- Published
- 2018
9. Fuzzy clustering of multivariate skew data
- Author
- Colubi, A, Gatu, C, García-Escudero, LA, Greselin, F, and Mayo-Iscar, A
- Abstract
With the increasing availability of multivariate datasets, asymmetric structures in the data ask for more realistic assumptions, with respect to the incredibly useful paradigm given by the Gaussian distribution. Moreover, in performing ML estimation we know that a few outliers in the data can affect the estimation, hence providing unreliable inference. Challenged by such issues, more flexible and solid tools for modeling heterogeneous skew data are needed. Our fuzzy clustering method is based on mixtures of Skew Gaussian components, endowed by the joint usage of impartial trimming and constrained estimation of scatter matrices, in a modified maximum likelihood approach. The algorithm generates a set of membership values, that are used to fuzzy partition the data set and to contribute to the robust estimates of the mixture parameters. The new methodology has been shown to be resistant to different types of contamination, by applying it on artificial data. A brief discussion on the tuning parameters has been developed, also with the help of some heuristic tools for their choice. Finally, synthetic and real datasets are analyzed, to show how intermediate membership values are estimated for observations lying at cluster overlap, while cluster cores are composed by observations that are assigned to a cluster in a crisp way.
- Published
- 2018
10. Robust fuzzy and parsimonious clustering based on mixtures of Factor Analyzers
- Author
- García-Escudero, LA, Greselin, F, and Mayo-Iscar, A
- Abstract
A clustering algorithm that combines the advantages of fuzzy clustering and robust statistical estimators is presented. It is based on mixtures of Factor Analyzers, endowed by the joint usage of trimming and the constrained estimation of scatter matrices, in a modified maximum likelihood approach. The algorithm generates a set of membership values, that are used to fuzzy partition the data set and to contribute to the robust estimates of the mixture parameters. The adoption of clusters modeled by Gaussian Factor Analysis allows for dimension reduction and for discovering local linear structures in the data. The new methodology has been shown to be resistant to different types of contamination, by applying it on artificial data. A brief discussion on the tuning parameters, such as the trimming level, the fuzzifier parameter, the number of clusters and the value of the scatter matrices constraint, has been developed, also with the help of some heuristic tools for their choice. Finally, a real data set has been analyzed, to show how intermediate membership values are estimated for observations lying at cluster overlap, while cluster cores are composed by observations that are assigned to a cluster in a crisp way.
- Published
- 2018
11. Eigenvalues and constraints in mixture modeling: Geometric and computational issues
- Author
- García-Escudero, LA, Gordaliza, A, Greselin, F, Ingrassia, S, and Mayo-Iscar, A
- Abstract
This paper presents a review about the usage of eigenvalues restrictions for constrained parameter estimation in mixtures of elliptical distributions according to the likelihood approach. The restrictions serve a twofold purpose: to avoid convergence to degenerate solutions and to reduce the onset of non interesting (spurious) local maximizers, related to complex likelihood surfaces. The paper shows how the constraints may play a key role in the theory of Euclidean data clustering. The aim here is to provide a reasoned survey of the constraints and their applications, considering the contributions of many authors and spanning the literature of the last 30 years.
- Published
- 2018
12. Robust estimation for mixtures of skew data
- Author
- García-Escudero, L, Greselin, F, McLachlan, G, and Mayo-Iscar, A
- Subjects
- SECS-S/01 - STATISTICA, Skew data, Heterogeneity, mixture models, robust estimation, constrained estimation, trimming, Maximum likelihood, Expectation-maximization
- Abstract
Recently, observed departures from the classical Gaussian mixture model in real datasets have led to the introduction of more flexible tools for modeling heterogeneous skew data. Among the latest proposals in the literature, we consider mixtures of skew normal, to incorporate asymmetry in components, as well as mixtures of t, to down-weight the contribution of extremal observations. Clearly, mixtures of skew t have widened the application of model based clustering and classification to a great many real datasets, as they can adapt to both asymmetry and leptokurtosis in the grouped data. Unfortunately, when data contamination occurs far from the bulk of the data, or even between the groups, classical inference for these models is not reliable. Our proposal is to address robust estimation of mixtures of skew normal, to resist sparse outliers and even pointwise contamination that could arise in data collection. We introduce a constructive way to obtain a robust estimator for the mixture of skew normal model, by incorporating impartial trimming and constraints in the EM algorithm. At each E-step, a low percentage of less plausible observations, under the estimated model, is tentatively trimmed; at the M-step, constraints on the scatter matrices are employed to avoid singularities and reduce spurious maximizers. Some applications on artificial and real data show the effectiveness of our proposal, and the joint role of trimming and constraints to achieve robustness.
- Published
- 2015
13. Robust clustering for heterogeneous skew data
- Author
- Mola, F, Conversano, C, García-Escudero, LA, Greselin, F, and Mayo-Iscar, A
- Abstract
The existing robust methods for model-based classification and clustering deal with elliptically contoured components. Here we introduce robust estimation for mixtures of skew-normal, by the joint usage of trimming and constraints. The model allows heterogeneous skew data to be fitted with great flexibility.
- Published
- 2015
14. Robust estimation for mixtures of Gaussian factor analyzers
- Author
- Gijbels, I, Hubert, M, Park, BU, Welsch, R, García Escudero, L, Gordaliza, A, Greselin, F, Ingrassia, S, and Mayo Iscar, A
- Abstract
Mixtures of Gaussian factors are powerful tools for modeling an unobserved heterogeneous population, offering at the same time dimension reduction and model-based clustering. Unfortunately, the high prevalence of spurious solutions and the disturbing effects of outlying observations, along maximum likelihood estimation, open serious issues. We consider restrictions for the component covariances, to avoid spurious solutions, and trimming, to provide robustness against violations of normality assumptions of the underlying latent factors. A detailed AECM algorithm for this new approach is presented. Simulation results and an application to the AIS dataset show the aim and effectiveness of the proposed methodology.
- Published
- 2015
15. Fuzzy clustering of multivariate skew data
- Author
- García-Escudero, LA, Greselin, F, Mayo-Iscar, A, Colubi, A, and Gatu, C
- Subjects
- fuzzy clustering, skew data, robust statistics, SECS-S/01 - STATISTICA
- Abstract
With the increasing availability of multivariate datasets, asymmetric structures in the data ask for more realistic assumptions, with respect to the incredibly useful paradigm given by the Gaussian distribution. Moreover, in performing ML estimation we know that a few outliers in the data can affect the estimation, hence providing unreliable inference. Challenged by such issues, more flexible and solid tools for modeling heterogeneous skew data are needed. Our fuzzy clustering method is based on mixtures of Skew Gaussian components, endowed by the joint usage of impartial trimming and constrained estimation of scatter matrices, in a modified maximum likelihood approach. The algorithm generates a set of membership values, that are used to fuzzy partition the data set and to contribute to the robust estimates of the mixture parameters. The new methodology has been shown to be resistant to different types of contamination, by applying it on artificial data. A brief discussion on the tuning parameters has been developed, also with the help of some heuristic tools for their choice. Finally, synthetic and real datasets are analyzed, to show how intermediate membership values are estimated for observations lying at cluster overlap, while cluster cores are composed by observations that are assigned to a cluster in a crisp way.
- Published
- 2018
16. Robust, fuzzy, and parsimonious clustering based on mixtures of Factor Analyzers
- Author
- Agustín Mayo Iscar, Luis Angel García-Escudero, and Francesca Greselin
- Subjects
- Fuzzy clustering, Computer science, Gaussian, Outliers identification, 02 engineering and technology, Unsupervised learning, 01 natural sciences, Fuzzy logic, Theoretical Computer Science, Set (abstract data type), 010104 statistics & probability, Artificial Intelligence, 0202 electrical engineering, electronic engineering, information engineering, Robust clustering, 0101 mathematics, Cluster analysis, Applied Mathematics, Dimensionality reduction, Estimator, Data set, ComputingMethodologies_PATTERNRECOGNITION, SECS-S/01 - STATISTICA, Dimension reduction, 020201 artificial intelligence & image processing, Factor analysis, Hard contrast, Algorithm, Software
- Abstract
A clustering algorithm that combines the advantages of fuzzy clustering and robust statistical estimators is presented. It is based on mixtures of Factor Analyzers, endowed by the joint usage of trimming and the constrained estimation of scatter matrices, in a modified maximum likelihood approach. The algorithm generates a set of membership values, that are used to fuzzy partition the data set and to contribute to the robust estimates of the mixture parameters. The adoption of clusters modeled by Gaussian Factor Analysis allows for dimension reduction and for discovering local linear structures in the data. The new methodology has been shown to be resistant to different types of contamination, by applying it on artificial data. A brief discussion on the tuning parameters, such as the trimming level, the fuzzifier parameter, the number of clusters and the value of the scatter matrices constraint, has been developed, also with the help of some heuristic tools for their choice. Finally, a real data set has been analyzed, to show how intermediate membership values are estimated for observations lying at cluster overlap, while cluster cores are composed by observations that are assigned to a cluster in a crisp way.
Funding: Ministerio de Economía y Competitividad grant MTM2017-86061-C2-1-P, and Consejería de Educación de la Junta de Castilla y León and FEDER grants VA005P17 and VA002G18
- Published
- 2018
17. Extending robust fuzzy clustering to skew data
- Author
- García-Escudero, LA, Greselin, F, Mayo-Iscar, A, Ana Colubi, Erricos J. Kontoghiorghes, and Herman K. Van Dijk
- Subjects
- SECS-S/01 - STATISTICA, Fuzzy clustering, skew components, skew Gaussian distribution, impartial trimming, outliers, robust statistics, robust inference
- Abstract
Clustering is an important technique in exploratory data analysis, with applications in image processing, object classification, target recognition, data mining etc. The aim is to partition data according to natural classes present in it, assigning data points that are more similar to the same cluster. We solved this ill-posed problem by adopting a fuzzy clustering method, based on mixtures of skew Gaussian, endowed by the joint usage of trimming and constrained estimation of scatter matrices. A set of membership values are used to fuzzy partition the data and to contribute to the robust estimates of the mixture parameters. The purpose is to adopt the basic skew Gaussian component for the mixture and apply impartial trimming to the data, to model the skew core of the clusters and to adapt to any type of tail behaviour. The choice of the skew Gaussian components is motivated by the fact that, with the increased availability of multivariate datasets, often underlying asymmetric structures appear. In these cases, the extremely useful paradigm for clustering given by the mixtures of Gaussian distributions appeared somehow unrealistic. Moreover, impartial trimming provides robust ML estimation, even in presence of outliers in the data. Finally, synthetic and real data are analyzed, to show how intermediate membership values are estimated for observations lying at cluster overlap, while cluster cores are composed by observations that are assigned to a cluster in a crisp way.
- Published
- 2018
18. Eigenvalues and constraints in mixture modeling: geometric and computational issues
- Author
- Francesca Greselin, Agustín Mayo-Iscar, Luis Angel García-Escudero, Salvatore Ingrassia, and Alfonso Gordaliza
- Subjects
- Statistics and Probability, Mathematical optimization, Eigenvalue, 01 natural sciences, Eigenvalues, EM algorithm, Mixture model, Model-based clustering, 1707 Computer Vision and Pattern Recognition, Applied Mathematics, 010104 statistics & probability, 0502 economics and business, Expectation–maximization algorithm, Convergence (routing), Euclidean geometry, 0101 mathematics, Spurious relationship, Cluster analysis, 050205 econometrics, Mathematics, Estimation theory, 05 social sciences, Constrained clustering, Computer Science Applications, SECS-S/01 - STATISTICA
- Abstract
This paper presents a review about the usage of eigenvalues restrictions for constrained parameter estimation in mixtures of elliptical distributions according to the likelihood approach. These restrictions serve a twofold purpose: to avoid convergence to degenerate solutions and to reduce the onset of non-interesting (spurious) maximizers, related to complex likelihood surfaces. The paper shows how the constraints may play a key role in the theory of Euclidean data clustering. The aim here is to provide a reasoned review of the constraints and their applications, along the contributions of many authors, spanning the literature of the last thirty years.
Funding: Spanish Ministerio de Economía y Competitividad (grant MTM2017-86061-C2-1-P), Junta de Castilla y León - Fondo Europeo de Desarrollo Regional (grants VA005P17 and VA002G18)
- Published
- 2018
19. Robust clustering for heterogeneous skew data
- Author
- García-Escudero, LA, Greselin, F, Mayo-Iscar, A, Mola, F, and Conversano, C
- Subjects
- Clustering, robust estimation, skew data, SECS-S/01 - STATISTICA
- Abstract
The existing robust methods for model-based classification and clustering deal with elliptically contoured components. Here we introduce robust estimation for mixtures of skew-normal, by the joint usage of trimming and constraints. The model allows heterogeneous skew data to be fitted with great flexibility.
- Published
- 2015
20. Robust estimation for mixtures of Gaussian factor analyzers
- Author
- García Escudero, L., Gordaliza, A., Greselin, F., Ingrassia, Salvatore, Mayo Iscar, A., Gijbels, I, Hubert, M, Park, BU, and Welsch, R
- Subjects
- SECS-S/01 - STATISTICA, Trimming, Factor analysis, Mixture Models, EM, Robust estimation, Constrained estimation, Dimension reduction
- Abstract
Mixtures of Gaussian factors are powerful tools for modeling an unobserved heterogeneous population, offering at the same time dimension reduction and model-based clustering. Unfortunately, the high prevalence of spurious solutions and the disturbing effects of outlying observations, along maximum likelihood estimation, open serious issues. We consider restrictions for the component covariances, to avoid spurious solutions, and trimming, to provide robustness against violations of normality assumptions of the underlying latent factors. A detailed AECM algorithm for this new approach is presented. Simulation results and an application to the AIS dataset show the aim and effectiveness of the proposed methodology.
- Published
- 2015
Catalog
Discovery Service for Jio Institute Digital Library