Descriptor: "62J07" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"62J07"' showing total 1,069 results

1. Sparse Linear Regression when Noises and Covariates are Heavy-Tailed and Contaminated by Outliers

Author: Sasai, Takeyuki and Fujisawa, Hironori
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning, 62J07
Abstract: We investigate the problem of estimating the coefficients of linear regression under a sparsity assumption when covariates and noises are sampled from heavy-tailed distributions. Additionally, we consider the situation where covariates and noises are not only sampled from heavy-tailed distributions but also contaminated by outliers. Our estimators can be computed efficiently and exhibit sharp error bounds.
Comment: This research builds on and improves the results of arXiv:2206.07594. There will be no further update for the earlier manuscript.
Published: 2024

2. $\ell_1$-Regularized Generalized Least Squares

Author: Nobari, Kaveh S. and Gibberd, Alex
Subjects: Statistics - Methodology, Mathematics - Statistics Theory, Statistics - Machine Learning, 62J07
Abstract: In this paper we propose an $\ell_1$-regularized GLS estimator for high-dimensional regressions with potentially autocorrelated errors. We establish non-asymptotic oracle inequalities for estimation accuracy in a framework that allows for highly persistent autoregressive errors. In practice, the whitening matrix required to implement the GLS is unknown; we present a feasible estimator for this matrix, derive consistency results, and ultimately show how our proposed feasible GLS can closely recover the optimal performance of the LASSO (as if the errors were white noise). A simulation study verifies the performance of the proposed method, demonstrating that the penalized (feasible) GLS-LASSO estimator performs on par with the LASSO in the case of white noise errors, whilst outperforming it in terms of sign-recovery and estimation error when the errors exhibit significant correlation.
Comment: 13 pages, 6 figures
Published: 2024

3. Adaptive Ridge Approach to Heteroscedastic Regression

Author: Ho, Ka Long Keith and Masuda, Hiroki
Subjects: Mathematics - Statistics Theory, 62J07
Abstract: We propose an adaptive ridge (AR) based estimation scheme for a heteroscedastic linear model equipped with log-linear errors. We simultaneously estimate the mean and variance parameters and show new asymptotic distributional and tightness properties in a sparse setting. We also show that estimates for zero parameters shrink with more iterations under suitable assumptions for tuning parameters. We observe possible generalizations of this paper's results through simulations and will apply the estimation method in forecasting electricity consumption.
Comment: 25 pages, 3 tables, 7 figures
Published: 2024

4. Efficient Sparse Least Absolute Deviation Regression with Differential Privacy

Author: Liu, Weidong, Mao, Xiaojun, Zhang, Xiaofei, and Zhang, Xin
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning, Statistics - Methodology, 62J07
Abstract: In recent years, privacy-preserving machine learning algorithms have attracted increasing attention because of their important applications in many scientific fields. However, in the literature, most privacy-preserving algorithms demand learning objectives to be strongly convex and Lipschitz smooth, which thus cannot cover a wide class of robust loss functions (e.g., quantile/least absolute loss). In this work, we aim to develop a fast privacy-preserving learning solution for a sparse robust regression problem. Our learning loss consists of a robust least absolute loss and an $\ell_1$ sparse penalty term. To solve the non-smooth loss quickly under a given privacy budget, we develop a Fast Robust And Privacy-Preserving Estimation (FRAPPE) algorithm for least absolute deviation regression. Our algorithm achieves fast estimation by reformulating the sparse LAD problem as a penalized least squares estimation problem and adopts a three-stage noise injection to guarantee $(\epsilon,\delta)$-differential privacy. We show that our algorithm can achieve a better trade-off between privacy and statistical accuracy than state-of-the-art privacy-preserving regression algorithms. Finally, we conduct experiments to verify the efficiency of our proposed FRAPPE algorithm.
Comment: IEEE Transactions on Information Forensics and Security, 2024
Published: 2024

5. Modified ridge estimator in the Bell regression model.

Author: Bulut, Y. Murat, Lukman, Adewale F., Işılar, Melike, Adewuyi, Emmanuel T., and Algamal, Zakariya Y.
Abstract: The Bell regression model (BRM), a member of the generalized linear models (GLMs), can be used when the dependent variable consists of overdispersed count data. The maximum likelihood estimator (MLE) is generally used to estimate unknown regression coefficients. The major drawback of the MLE is an inflated variance when multicollinearity problems occur. In this study, we proposed a new biased estimator to cope with the multicollinearity in the BRM. The simulation study is conducted to illustrate the performance of the proposed estimator over the MLE and Bell ridge estimator (BRE). Also, we give real data examples to approve the applicability of the proposed estimator in real data problems. [ABSTRACT FROM AUTHOR]
Published: 2024

6. High-dimensional semiparametric mixed-effects model for longitudinal data with non-normal errors.

Author: Taavoni, Mozhgan and Arashi, Mohammad
Subjects: *COVARIANCE matrices, *GAUSSIAN distribution, *SAMPLE size (Statistics), *DATA modeling, *GENERALIZED estimating equations
Abstract: Difficulties may arise when analyzing longitudinal data using mixed-effects models if nonparametric functions are present in the linear predictor component. This study extends semiparametric mixed-effects modeling in cases when the response variable does not always follow a normal distribution and the nonparametric component is structured as an additive model. A novel approach is proposed to identify significant linear and non-linear components using a double-penalized generalized estimating equation with two penalty terms. Furthermore, the iterative approach intends to enhance the efficiency of estimating regression coefficients by incorporating the calculation of the working covariance matrix. The oracle properties of the resulting estimators are established under certain regularity conditions, where the dimensions of both the parametric and nonparametric components increase as the sample size grows. We perform numerical studies to demonstrate the efficacy of our proposal. [ABSTRACT FROM AUTHOR]
Published: 2024

7. Jackknife Kibria-Lukman estimator for the beta regression model.

Author: Koç, Tuba and Dünder, Emre
Subjects: *MAXIMUM likelihood statistics, *REGRESSION analysis, *MULTICOLLINEARITY, *DEPENDENT variables, *POCKETKNIVES, *PERCENTILES
Abstract: The beta regression model is a flexible model that is widely used when the dependent variable takes the form of ratios or percentages in the range (0, 1). The coefficients of the beta regression model are estimated using the maximum likelihood method. In the presence of multicollinearity, the use of maximum likelihood (ML) leads to problems such as inconsistent parameter estimates and inflated variance. In this study, the KL estimator and its jackknifed version are proposed to reduce the effects of multicollinearity in the beta regression model. The performance of the proposed jackknifed KL beta regression estimator is compared with ridge, Liu and KL estimators through simulation studies and real data applications. The results show that the proposed estimators mostly outperform ML, ridge, Liu and KL estimators. [ABSTRACT FROM AUTHOR]
Published: 2024

8. Identifying a class of Ridge-type estimators in binary logistic regression models.

Author: Ertan, Esra and Akay, Kadri Ulaş
Subjects: *REGRESSION analysis, *MONTE Carlo method, *MAXIMUM likelihood statistics, *MINI-Mental State Examination, *LOGISTIC regression analysis, *MULTICOLLINEARITY
Abstract: In the analysis of logistic regression models, various biased estimators have been proposed as an alternative to the maximum likelihood estimator (MLE) for estimating model parameters in the presence of multicollinearity. In this study, a new class of biased estimators called Logistic Ridge-type Estimator (LRTE) is proposed by generalizing the existing biased estimators that include two biasing parameters. The performance of the proposed estimator is compared with the other biased estimators in terms of the Matrix Mean Squared Error (MMSE). Two separate Monte Carlo simulation studies are conducted to investigate the performance of the proposed estimator. A numerical example is provided to demonstrate the performance of the proposed biased estimator. The results revealed that LRTE performed better than other existing biased estimators under the conditions investigated in this study. [ABSTRACT FROM AUTHOR]
Published: 2024

9. Lasso and elastic nets by orthants

Author: Maruri-Aguilar, Hugo
Subjects: Statistics - Methodology, Mathematics - Statistics Theory, Statistics - Computation, 62J07
Abstract: We propose a new method for computing the lasso path, using the fact that the Manhattan norm of the coefficient vector is linear over every orthant of the parameter space. We use simple calculus and present an algorithm in which the lasso path is a series of orthant moves. Our proposal gives the same results as standard literature, with the advantage of neat interpretation of results and explicit lasso formulæ. We extend this proposal to elastic nets and obtain explicit, exact formulæ for the elastic net path, and with a simple change, our lasso algorithm can be used for elastic nets. We present computational examples and provide simple R prototype code.
Comment: 44 pages, 10 figures, 3 tables
Published: 2023

10. Estimation of sparse linear regression coefficients under $L$-subexponential covariates

Author: Sasai, Takeyuki
Subjects: Mathematics - Statistics Theory, Statistics - Machine Learning, 62J07
Abstract: We tackle estimating sparse coefficients in a linear regression when the covariates are sampled from an $L$-subexponential random vector. This vector belongs to a class of distributions that exhibit heavier tails than a Gaussian random vector. Previous studies have established error bounds similar to those derived for Gaussian random vectors. However, these methods require stronger conditions than those used for Gaussian random vectors to derive the error bounds. In this study, we present an error bound identical to the one obtained for Gaussian random vectors up to constant factors without imposing stronger conditions, when the covariates are drawn from an $L$-subexponential random vector. Interestingly, we employ an $\ell_1$-penalized Huber regression, which is known for its robustness against heavy-tailed random noises rather than covariates. We believe that this study uncovers a new aspect of the $\ell_1$-penalized Huber regression method.
Published: 2023

11. Consistent ridge estimation for replicated ultrastructural measurement error models.

Author: Üstündağ Şiray, Gülesen
Subjects: *MULTICOLLINEARITY, *MEASUREMENT errors, *ERRORS-in-variables models
Abstract: The presence of measurement errors in data and multicollinearity among the explanatory variables have negative effects on the estimation of regression coefficients. Within this respect, the motivation of this article is to examine measurement errors and multicollinearity problems simultaneously. In this paper, by utilizing three different forms of corrected score functions three consistent ridge regression estimators are proposed. Theoretical comparisons of these new estimators are examined by implementing the mean squared error criterion. Large sample properties of these estimators are investigated without assuming any distributional assumption. Two numerical examples are presented using real data sets and also a simulation study is performed. The findings indicate that the newly proposed three estimators outperform the existing estimators by the criterion of mean squared error. [ABSTRACT FROM AUTHOR]
Published: 2024

12. A new general biased estimator in linear measurement error model.

Author: Goyal, Pragya, Tiwari, Manoj K., Bist, Vikas, and Ababneh, Faisal
Subjects: *ERRORS-in-variables models, *MONTE Carlo method, *LENGTH measurement, *MULTICOLLINEARITY, *MEASUREMENT errors, *COMPUTER simulation
Abstract: Numerous biased estimators are known to circumvent the multicollinearity problem in linear measurement error models. This article proposes a general biased estimator with the ridge regression and the Liu estimators as special cases. The efficiency of the suggested estimator is compared with ridge regression and Liu estimators under the mean squared error matrix criterion. In addition, a Monte Carlo simulation study and a numerical evaluation have been conducted to elucidate the superiority of the new general biased estimator over other estimators. [ABSTRACT FROM AUTHOR]
Published: 2024

13. Improved estimators in bell regression model with application.

Author: Seifollahi, Solmaz, Bevrani, Hossein, and Algamal, Zakariya Yahya
Abstract: In this paper, we propose the application of shrinkage strategies to estimate coefficients in the Bell regression models when prior information about the coefficients is available. The Bell regression models are well-suited for modelling count data with multiple covariates. Furthermore, we provide a detailed explanation of the asymptotic properties of the proposed estimators, including asymptotic biases and mean squared errors. To assess the performance of the estimators, we conduct numerical studies using Monte Carlo simulations and evaluate their simulated relative efficiency. The results demonstrate that the suggested estimators outperform the unrestricted estimator when prior information is taken into account. Additionally, we present an empirical application to demonstrate the practical utility of the suggested estimators. [ABSTRACT FROM AUTHOR]
Published: 2024

14. Linear mixed model selection via minimum approximated information criterion.

Author: Atutey, Olivia and Shang, Junfeng
Subjects: *DATA modeling, *MOTIVATION (Psychology)
Abstract: Linear mixed models (LMMs) are modeled using both fixed effects and random effects for correlated data. The random intercepts (RI) and random intercepts and slopes (RIS) models are two exceptional cases from linear mixed models. Our primary focus is to propose an approach for simultaneous selection and estimation in linear mixed models. We design a penalized log-likelihood procedure referred to as the minimum approximated information criteria for LMMs (lmmMAIC), which is utilized to select the most appropriate model for generalizing the data. The proposed lmmMAIC procedure for variable selection and estimation can estimate the parameters while shrinking the unimportant fixed effects estimates to zero and is mainly motivated by both regularized methods and information criteria. This procedure enforces variable selection and sparse estimation simultaneously by adding a penalty term to the negative log-likelihood of linear mixed models. The method differs from existing regularized methods mainly due to the penalty parameter and the penalty function which is approximated L0 norm with unit dent. A simulation study is performed to demonstrate the effectiveness of the lmmMAIC method, and the simulation results show that the lmmMAIC method outperforms some other penalized methods in estimation and model selection by the comparison of the simulation results. The proposed method is also applied in Riboflavin data for the illustration of its behavior. [ABSTRACT FROM AUTHOR]
Published: 2024

15. Liu-type shrinkage strategies in zero-inflated negative binomial models with application to Expenditure and Default Data.

Author: Zandi, Zahra, Arabi Belaghi, Reza, and Bevrani, Hossein
Subjects: *MONTE Carlo method, *INDEPENDENT variables, *MAXIMUM likelihood statistics, *REGRESSION analysis, *MULTICOLLINEARITY, *DEFAULT (Finance)
Abstract: In modeling count data with overdispersion and extra zeros, the zero-inflated negative binomial (ZINB) regression model is useful. In a regression model, the multicollinearity problem arises when there are high correlations between predictor variables. Under this problem, the maximum likelihood estimator is no longer efficient. The ridge and Liu-type estimators have been proposed to combat the multicollinearity problem, with the Liu-type estimator performing better. In this paper, we propose Liu-type shrinkage estimators, namely the linear shrinkage, preliminary test, shrinkage preliminary test, Stein-type, and positive Stein-type Liu estimators, to estimate the count parameters in the ZINB model when some of the predictor variables do not have a significant effect in predicting the response variable, so that a sub-model may be sufficient. The asymptotic distributional biases and variances of the proposed estimators are derived. We also compare the performance of the Liu-type shrinkage estimators with that of the Liu-type unrestricted estimator in an extensive Monte Carlo simulation study. The results show that the performances of the proposed estimators are superior to those based on Liu-type unrestricted estimators. We also apply the proposed estimation methods to Expenditure and Default Data. [ABSTRACT FROM AUTHOR]
Published: 2024

16. Smoothed empirical likelihood estimation and automatic variable selection for an expectile high-dimensional model.

Author: Ciuperca, Gabriela
Subjects: *DISTRIBUTION (Probability theory), *CHI-square distribution, *ASYMPTOTIC normality, *ALGORITHMS
Abstract: We consider a linear model which can have a large number of explanatory variables, errors with an asymmetric distribution, or values of the explained variable that are missing at random. In order to take into account these several situations, we consider the nonparametric empirical likelihood (EL) estimation method. Because a constraint in EL contains an indicator function, a smoothed function is considered instead of the indicator. Two smoothed expectile maximum EL methods are proposed, one of which will automatically select the explanatory variables. For each of the methods we obtain the convergence rate of the estimators and their asymptotic normality. The smoothed expectile empirical log-likelihood ratio process follows asymptotically a chi-square distribution, and moreover the adaptive LASSO smoothed expectile maximum EL estimator satisfies the sparsity property, which guarantees the automatic selection of zero model coefficients. In order to implement these methods, we propose four algorithms. [ABSTRACT FROM AUTHOR]
Published: 2024

17. A New Modified Generalized Two Parameter Estimator for linear regression model.

Author: Sidhu, Bavneet Kaur, Tiwari, Manoj Kumar, Bist, Vikas, Kumar, Manoj, and Pathak, Anurag
Subjects: *LEAST squares, *MONTE Carlo method, *REGRESSION analysis, *MULTICOLLINEARITY, *PARAMETER estimation
Abstract: The Ordinary Least Squares estimator estimates the parameter vectors in a linear regression model. However, it gives misleading results when the input variables are highly correlated, emanating the issue of multicollinearity. In light of multicollinearity, we wish to obtain more accurate estimators of the regression coefficients than the least square estimators. The main problem of least square estimation is to tackle multicollinearity so as to get more accurate estimates. In this paper, we introduce a New Modified Generalized Two Parameter Estimator by merging the Generalized Two Parameter Estimator and the Modified Two Parameter Estimator and compare it with other known estimators like Ordinary Least Squares Estimator, Ridge Regression Estimator, Liu estimator, Modified Ridge Estimator, Modified Liu Estimator and Modified Two Parameter Estimator. Mean Squared Error Matrix criterion was used to compare the new estimator over existing estimators. The estimation of the biased parameters is discussed. Necessary and sufficient conditions are derived to compare the proposed estimator with the existing estimators. The excellence of the new estimator over existing estimators is illustrated with the help of real data set and a Monte Carlo simulation study. The results indicate that the newly developed estimator is more efficient as it has lower mean square error. [ABSTRACT FROM AUTHOR]
Published: 2024

18. Marginalized LASSO in the low-dimensional difference-based partially linear model for variable selection.

Author: Norouzirad, M., Moura, R., Arashi, M., and Marques, F. J.
Subjects: *REGRESSION analysis, *NUMERICAL analysis
Abstract: The difference-based partially linear model is an appropriate regression model when both linear and nonlinear predictors are present in the data. However, when we want to optimize the weights using the difference-based method, the problem of variable selection can be difficult since low-variance predictors present a challenge. Therefore, this study aims to establish a novel methodology based on marginal theory to tackle such mixed relationships successfully, emphasizing variable selection in low dimensions. We suggest using a marginalized LASSO estimator with a penalty term that is not as severe and related to the difference order. As part of our numerical analysis of small sample performance, we undertake comprehensive simulation experiments to numerically demonstrate the strength of our proposed technique in estimation and prediction compared to the LASSO under a low-dimensional setup. This is done so that we can numerically demonstrate the strength of our proposed method in estimation and prediction. The bootstrapped method is utilized to evaluate how well our proposed prediction method performs when examining the King House dataset. [ABSTRACT FROM AUTHOR]
Published: 2024

19. Shrinkage estimation in the zero-inflated Poisson regression model with right-censored data.

Author: Zandi, Zahra, Bevrani, Hossein, and Belaghi, Reza Arabi
Subjects: *REGRESSION analysis, *MONTE Carlo method, *POISSON regression, *PARAMETER estimation, *DATA modeling
Abstract: In this article, we improve parameter estimation in the zero-inflated Poisson regression model using shrinkage strategies when it is suspected that the regression parameter vector may be restricted to a linear subspace. We consider a situation where the response variable is subject to right-censoring. We develop the asymptotic distributional biases and risks of the shrinkage estimators. We conduct an extensive Monte Carlo simulation for various combinations of the inactive predictors and censoring constants to compare the performance of the proposed estimators in terms of their simulated relative efficiencies. The results demonstrate that the shrinkage estimators outperform the classical estimator in certain parts of the parameter space. When there are many inactive predictors in the model, as well as when the censoring percentage is low, the proposed estimators perform better. The performance of the positive Stein-type estimator is superior to the Stein-type estimator in certain parts of the parameter space. We evaluated the estimators' performance using wildlife fish data. [ABSTRACT FROM AUTHOR]
Published: 2024

20. An almost unbiased Liu-type estimator in the linear regression model.

Author: Erdugan, Funda
Subjects: *REGRESSION analysis, *MULTICOLLINEARITY, *MONTE Carlo method, *LEAST squares
Abstract: A biased estimator, compared to least squares estimators, is one of the most used statistical procedures to overcome the problem of multicollinearity. Liu-type estimators, which are biased estimators, are preferred in a wide range of fields. In this article, we propose an almost unbiased Liu-type (AUNL) estimator and discuss its performance under the mean square error matrix criterion among existing estimators. The proposed AUNL estimator is a general estimator and is based on the function of a single biasing parameter. It includes an ordinary least squares estimator, an almost unbiased ridge estimator, an almost unbiased Liu estimator, and an almost unbiased two-parameter estimator. Finally, real data examples and a Monte Carlo simulation are provided to illustrate the theoretical results. [ABSTRACT FROM AUTHOR]
Published: 2024

21. Predicting dichotomised outcomes from high-dimensional data in biomedicine.

Author: Rauschenberger, Armin and Glaab, Enrico
Subjects: *STATISTICAL models, *FORECASTING, *RESEARCH personnel, *COGNITION disorders, *DISEASE progression, *LOGISTIC regression analysis
Abstract: In many biomedical applications, we are more interested in the predicted probability that a numerical outcome is above a threshold than in the predicted value of the outcome. For example, it might be known that antibody levels above a certain threshold provide immunity against a disease, or a threshold for a disease severity score might reflect conversion from the presymptomatic to the symptomatic disease stage. Accordingly, biomedical researchers often convert numerical to binary outcomes (loss of information) to conduct logistic regression (probabilistic interpretation). We address this bad statistical practice by modelling the binary outcome with logistic regression, modelling the numerical outcome with linear regression, transforming the predicted values from linear regression to predicted probabilities, and combining the predicted probabilities from logistic and linear regression. Analysing high-dimensional simulated and experimental data, namely clinical data for predicting cognitive impairment, we obtain significantly improved predictions of dichotomised outcomes. Thus, the proposed approach effectively combines binary with numerical outcomes to improve binary classification in high-dimensional settings. An implementation is available in the R package cornet on GitHub and CRAN. [ABSTRACT FROM AUTHOR]
Published: 2024

22. A Generalized Formulation for Group Selection via ADMM.

Author: Ke, Chengyu, Shin, Sunyoung, Lou, Yifei, and Ahn, Miju
Abstract: This paper studies a statistical learning model where the model coefficients have a pre-determined non-overlapping group sparsity structure. We consider a combination of a loss function and a regularizer to recover the desired group sparsity patterns, which can embrace many existing works. We analyze directional stationary solutions of the proposed formulation, obtaining a sufficient condition for a directional stationary solution to achieve optimality and establishing a bound of the distance from the solution to a reference point. We develop an efficient algorithm that adopts an alternating direction method of multiplier (ADMM), showing that the iterates converge to a directional stationary solution under certain conditions. In the numerical experiment, we implement the algorithm for generalized linear models with convex and nonconvex group regularizers to evaluate the model performance on various data types, noise levels, and sparsity settings. [ABSTRACT FROM AUTHOR]
Published: 2024

23. FDR control and power analysis for high-dimensional logistic regression via StabKoff.

Author: Yuan, Panxu, Kong, Yinfei, and Li, Gaorong
Subjects: OPIOID abuse, LOGISTIC regression analysis, REGRESSION analysis, MACHINE learning, FALSE discovery rate
Abstract: Identifying significant variables for the high-dimensional logistic regression model is a fundamental problem in modern statistics and machine learning. This paper introduces a stability knockoffs (StabKoff) selection procedure by merging stability selection and knockoffs to conduct controlled variable selection for logistic regression. Under some regularity conditions, we show that the proposed method achieves FDR control under the finite-sample setting, and the power also asymptotically approaches one as the sample size tends to infinity. In addition, we further develop an intersection strategy that allows better separation of knockoff statistics between significant and unimportant variables, which in some cases leads to an increase in power. The simulation studies demonstrate that the proposed method possesses satisfactory finite-sample performance compared with existing methods in terms of both FDR and power. We also apply the proposed method to a real data set on opioid use disorder treatment. [ABSTRACT FROM AUTHOR]
Published: 2024

24. Robust Distributional Regression with Automatic Variable Selection

Author: O'Neill, Meadhbh and Burke, Kevin
Subjects: Statistics - Methodology, 62J07
Abstract: Datasets with extreme observations and/or heavy-tailed error distributions are commonly encountered and should be analyzed with careful consideration of these features from a statistical perspective. Small deviations from an assumed model, such as the presence of outliers, can cause classical regression procedures to break down, potentially leading to unreliable inferences. Other distributional deviations, such as heteroscedasticity, can be handled by going beyond the mean and modelling the scale parameter in terms of covariates. We propose a method that accounts for heavy tails and heteroscedasticity through the use of a generalized normal distribution (GND). The GND contains a kurtosis-characterizing shape parameter that moves the model smoothly between the normal distribution and the heavier-tailed Laplace distribution - thus covering both classical and robust regression. A key component of statistical inference is determining the set of covariates that influence the response variable. While correctly accounting for kurtosis and heteroscedasticity is crucial to this endeavour, a procedure for variable selection is still required. For this purpose, we use a novel penalized estimation procedure that avoids the typical computationally demanding grid search for tuning parameters. This is particularly valuable in the distributional regression setting where the location and scale parameters depend on covariates, since the standard approach would have multiple tuning parameters (one for each distributional parameter). We achieve this by using a "smooth information criterion" that can be optimized directly, where the tuning parameters are fixed at log(n) in the BIC case.
Published: 2022

25. Robust multi-outcome regression with correlated covariate blocks using fused LAD-lasso

Author: Möttönen, Jyrki, Lähderanta, Tero, Salonen, Janne, and Sillanpää, Mikko J.
Subjects: Statistics - Methodology, 62J07
Abstract: Lasso is a popular and efficient approach to simultaneous estimation and variable selection in high-dimensional regression models. In this paper, a robust LAD-lasso method for multiple outcomes is presented that addresses the challenges of non-normal outcome distributions and outlying observations. Measured covariate data from space or time, or spectral bands or genomic positions often have natural correlation structure arising from measuring distance between the covariates. The proposed multi-outcome approach includes handling of such covariate blocks by a group fusion penalty, which encourages similarity between neighboring regression coefficient vectors by penalizing their differences for example in sequential data situation. Properties of the proposed approach are first illustrated by extensive simulations, and secondly the method is applied to a real-life skewed data example on retirement behavior with heteroscedastic explanatory variables.
Published: 2022

26. Bayesian Stein-type shrinkage estimators in high-dimensional linear regression models

Author: Zanboori, Ahmadreza, Zanboori, Ehsan, Mousavi, Maryam, and Mirjalili, Sayyed Mahmoud
Published: 2024
Full Text: View/download PDF

27. Linear Convergence of ISTA and FISTA

Author: Li, Bo-Wen, Shi, Bin, and Yuan, Ya-Xiang
Published: 2024
Full Text: View/download PDF

28. Penalization-induced shrinking without rotation in high dimensional GLM regression: a cavity analysis

Author: Massa, Emanuele, Jonker, Marianne, and Coolen, Anthony
Subjects: Mathematics - Statistics Theory, Condensed Matter - Disordered Systems and Neural Networks, 62J07
Abstract: In high dimensional regression, where the number of covariates is of the order of the number of observations, ridge penalization is often used as a remedy against overfitting. Unfortunately, for correlated covariates such regularisation typically induces in generalized linear models not only shrinking of the estimated parameter vector, but also an unwanted rotation relative to the true vector. We show analytically how this problem can be removed by using a generalization of ridge penalization, and we analyse the asymptotic properties of the corresponding estimators in the high dimensional regime, using the cavity method. Our results also provide a quantitative rationale for tuning the parameter controlling the amount of shrinking. We compare our theoretical predictions with simulated data and find excellent agreement.
Published: 2022
Full Text: View/download PDF

29. On some two parameter estimators for the linear regression models with correlated predictors: simulation and application.

Author: Khan, Muhammad Shakir, Ali, Amjad, Suhail, Muhammad, and Kibria, B. M. Golam
Subjects: *MULTICOLLINEARITY, *REGRESSION analysis, *INDEPENDENT variables, *LEAST squares, *MONTE Carlo method, *STATISTICAL models
Abstract: Regression analysis is widely used to predict the response variable utilizing one or more predictor variables. In many fields of study, the predictors are highly correlated, causing a multicollinearity problem that severely affects the efficiency of ordinary least squares (OLS) estimators by significantly inflating their variances. To solve the multicollinearity problem, various one- and two-parameter ridge estimators are available in the literature. In this article, a class of modified two-parameter Lipovetsky–Conklin ridge estimators is proposed, based on the eigenvalues of the X′X matrix, that provides an automatic option for treating different levels of multicollinearity. An extensive simulation study followed by a real-life example is used to evaluate the performance of the proposed estimators based on the MSE criterion. In most of the simulation conditions, our proposed estimators outperformed the existing estimators. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

30. Combining phenotypic and genomic data to improve prediction of binary traits.

Author: Jarquin, D., Roy, A., Clarke, B., and Ghosal, S.
Subjects: *PHENOTYPES, *CULTIVARS, *GENOTYPES, *PLANT breeders, *FORECASTING
Abstract: Plant breeders want to develop cultivars that outperform existing genotypes. Some characteristics (here 'main traits') of these cultivars are categorical and difficult to measure directly. It is important to predict the main trait of newly developed genotypes accurately. In addition to marker data, breeding programs often have information on secondary traits (or 'phenotypes') that are easy to measure. Our goal is to improve prediction of main traits with interpretable relations by combining the two data types using variable selection techniques. However, the genomic characteristics can overwhelm the set of secondary traits, so a standard technique may fail to select any phenotypic variables. We develop a new statistical technique that ensures appropriate representation from both the secondary traits and the genotypic variables for optimal prediction. When two data types (markers and secondary traits) are available, we achieve improved prediction of a binary trait by two steps that are designed to ensure that a significant intrinsic effect of a phenotype is incorporated in the relation before accounting for extra effects of genotypes. First, we sparsely regress the secondary traits on the markers and replace the secondary traits by their residuals to obtain the effects of phenotypic variables as adjusted by the genotypic variables. Then, we develop a sparse logistic classifier using the markers and residuals so that the adjusted phenotypes may be selected first to avoid being overwhelmed by the genotypic variables due to their numerical advantage. This classifier uses forward selection aided by a penalty term and can be computed effectively by a technique called the one-pass method. It compares favorably with other classifiers on simulated and real data. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

31. A note on Farebrother’s estimator: a comparative study.

Author: Kaçıranlar, Selahattin, Mirezi, Buatikan, and Güler, Hüseyin
Abstract: In this article, we begin by providing a theoretical comparison between Farebrother’s estimator and the ridge estimator, in terms of the MSE matrix criterion, under three distinct restrictions when r is random. Additionally, we propose an estimate of parameter k for Farebrother’s estimator. Subsequently, we present a Monte Carlo simulation experiment to compare Farebrother’s estimator with OLS, RLS, and ridge estimators. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

32. A new kernel two-parameter prediction under multicollinearity in partially linear mixed measurement error model.

Author: Yalaz, Seçil and Kuran, Özge
Subjects: *ERRORS-in-variables models, *MONTE Carlo method, *COVARIANCE matrices, *LENGTH measurement, *EARTHQUAKES, *MULTICOLLINEARITY, *MEASUREMENT errors
Abstract: A partially linear mixed effects model relating a response Y to predictors $(X,Z,T)$ with the mean function $X^{T}\beta + Zb + g(T)$ is considered in this paper. When the variables X of the parametric part are measured with additive error and the data are ill-conditioned owing to multicollinearity, a new kernel two-parameter prediction method using the kernel ridge and Liu regression approach is suggested. The kernel two-parameter estimator of β and the predictor of b are derived by modifying the likelihood and Henderson methods. Matrix mean square error comparisons are calculated. We also demonstrate that under suitable conditions, the resulting estimator of β is asymptotically normal. The situation with an unknown measurement error covariance matrix is handled. A Monte Carlo simulation study, together with an earthquake data example, is presented at the end of the paper to evaluate the effectiveness of the proposed approach. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

33. Shrinkage efficiency bounds: An extension.

Author: De Luca, Giuseppe and Magnus, Jan R.
Subjects: *GENERALIZATION
Abstract: Hansen (2005) obtained the efficiency bound (the lowest achievable risk) in the p-dimensional normal location model when p≥3, generalizing an earlier result of Magnus (2002) for the one-dimensional case (p = 1). The classes of estimators considered are, however, different in the two cases. We provide an alternative bound to Hansen's which is a more natural generalization of the one-dimensional case, and we compare the classes and the bounds. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

34. Dictionary-based model reduction for state estimation.

Author: Nouy, Anthony and Pasco, Alexandre
Abstract: We consider the problem of state estimation from a few linear measurements, where the state to recover is an element of the manifold M of solutions of a parameter-dependent equation. The state is estimated using prior knowledge on M coming from model order reduction. Variational approaches based on linear approximation of M, such as PBDW, yield a recovery error limited by the Kolmogorov width of M. To overcome this issue, piecewise-affine approximations of M have also been considered, that consist in using a library of linear spaces among which one is selected by minimizing some distance to M. In this paper, we propose a state estimation method relying on dictionary-based model reduction, where a space is selected from a library generated by a dictionary of snapshots, using a distance to the manifold. The selection is performed among a set of candidate spaces obtained from a set of ℓ1-regularized least-squares problems. Then, in the framework of parameter-dependent operator equations (or PDEs) with affine parametrizations, we provide an efficient offline-online decomposition based on randomized linear algebra, that ensures efficient and stable computations while preserving theoretical guarantees. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

35. ROBOUT: a conditional outlier detection methodology for high-dimensional data.

Author: Farnè, Matteo and Vouldis, Angelos
Subjects: OUTLIER detection, EUROZONE, REGRESSION analysis, LEAST squares, MULTICOLLINEARITY, SAMPLE size (Statistics)
Abstract: This paper presents a methodology, called ROBOUT, to identify outliers conditional on a high-dimensional noisy information set. In particular, ROBOUT is able to identify observations with outlying conditional mean or variance when the dataset contains multivariate outliers in or besides the predictors, multi-collinearity, and a large variable dimension compared to the sample size. ROBOUT entails a pre-processing step, a preliminary robust imputation procedure that prevents anomalous instances from corrupting predictor recovery, a selection stage of the statistically relevant predictors (through cross-validated LASSO-penalized Huber loss regression), the estimation of a robust regression model based on the selected predictors (via MM regression), and a criterion to identify conditional outliers. We conduct a comprehensive simulation study in which the proposed algorithm is tested under a wide range of perturbation scenarios. The combination formed by LASSO-penalized Huber loss and MM regression turns out to be the best in terms of conditional outlier detection under the above described perturbed conditions, also compared to existing integrated methodologies like Sparse Least Trimmed Squares and Robust Least Angle Regression. Furthermore, the proposed methodology is applied to a granular supervisory banking dataset collected by the European Central Bank, in order to model the total assets of euro area banks. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

36. Variable selection in proportional odds model with informatively interval-censored data.

Author: Zhao, Bo, Wang, Shuying, and Wang, Chunjie
Subjects: FAILURE time data analysis, CENSORING (Statistics), MULTICOLLINEARITY, REGRESSION analysis
Abstract: The proportional odds (PO) model is one of the most commonly used models for regression analysis of failure time data in survival analysis. It assumes that the odds of failure are proportional to the baseline odds at any point in time given the covariate. The model focuses on the situation in which the ratio of the hazards converges to unity as time goes to infinity, while the proportional hazards (PH) model has a constant ratio of hazards over time. In the paper, we consider a general type of failure time data, case K interval-censored data, that includes case I or case II interval-censored data as special cases. We propose a PO model-based unified penalized variable selection procedure that involves minimizing a negative sieve log-likelihood function plus a broken adaptive ridge penalty, with the initial values obtained from the ridge regression estimator. The proposed approach allows dependent censoring, which occurs quite often and could lead to biased or misleading estimates if it is not taken into account. We show that the proposed selection method has oracle properties and the estimator is semiparametrically efficient. The numerical studies suggest that the proposed approach works well for practical situations. In addition, the method is applied to an ADNI study that motivates this investigation. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

37. Spike and slab Bayesian sparse principal component analysis.

Author: Ning, Yu-Chien Bo and Ning, Ning
Abstract: Sparse principal component analysis (SPCA) is a popular tool for dimensionality reduction in high-dimensional data. However, there is still a lack of theoretically justified Bayesian SPCA methods that can scale well computationally. One of the major challenges in Bayesian SPCA is selecting an appropriate prior for the loadings matrix, considering that principal components are mutually orthogonal. We propose a novel parameter-expanded coordinate ascent variational inference (PX-CAVI) algorithm. This algorithm utilizes a spike and slab prior, which incorporates parameter expansion to cope with the orthogonality constraint. Besides comparing to two popular SPCA approaches, we introduce the PX-EM algorithm as an EM analogue to the PX-CAVI algorithm for comparison. Through extensive numerical simulations, we demonstrate that the PX-CAVI algorithm outperforms these SPCA approaches, showcasing its superiority in terms of performance. We study the posterior contraction rate of the variational posterior, providing a novel contribution to the existing literature. The PX-CAVI algorithm is then applied to study a lung cancer gene expression dataset. The R package VBsparsePCA with an implementation of the algorithm is available on the Comprehensive R Archive Network (CRAN). [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

38. Almost sure convergence for weighted sums of pairwise PQD random variables.

Author: da Silva, João Lita
Subjects: *RANDOM variables, *LAW of large numbers, *REGRESSION analysis, *DEPENDENT variables
Abstract: We obtain strong laws of large numbers of Marcinkiewicz–Zygmund's type for weighted sums of pairwise positively quadrant dependent random variables stochastically dominated by a random variable X ∈ L^p, 1 ⩽ p < 2. We use our results to establish the strong consistency of estimators which emerge from regression models having pairwise positively quadrant-dependent errors. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

39. Exact penalty method for knot selection of B-spline regression.

Author: Yagishita, Shotaro and Gotoh, Jun-ya
Abstract: This paper presents a new approach to selecting knots at the same time as estimating the B-spline regression model. Such simultaneous selection of knots and model is not trivial, but our strategy makes it possible by employing a nonconvex regularization on the least squares method that is usually applied. More specifically, motivated by the constraint that directly designates (the upper bound of) the number of knots to be used, we present an (unconstrained) regularized least squares reformulation, which is later shown to be equivalent to the motivating cardinality-constrained formulation. The obtained formulation is further modified so that we can employ a proximal gradient-type algorithm, known as GIST, for a class of nonconvex nonsmooth optimization problems. We show that, under a mild technical assumption, the algorithm reaches a local minimum of the problem. Since any local minimum of the problem is shown to satisfy the cardinality constraint, the proposed algorithm can be used to obtain a spline regression model that depends on at most a designated number of knots. Numerical experiments demonstrate how our approach performs on synthetic and real data sets. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

40. Bayesian weighted composite quantile regression estimation for linear regression models with autoregressive errors.

Author: Aghamohammadi, A. and Bahmani, M.
Subjects: *QUANTILE regression, *AUTOREGRESSIVE models, *REGRESSION analysis, *LAPLACE distribution, *GIBBS sampling, *DATA analysis
Abstract: Composite quantile regression methods have been shown to be effective techniques for improving prediction accuracy. In this article, we propose a Bayesian weighted composite quantile regression estimation procedure to estimate unknown regression coefficients and autoregressive parameters in linear regression models with autoregressive errors. A Bayesian joint hierarchical model is established using the working likelihood of the asymmetric Laplace distribution. Adaptive Lasso-penalized type priors are used on the regression coefficients and autoregressive parameters of the model to conduct inference and variable selection simultaneously. A Gibbs sampling algorithm is developed to simulate the parameters from the posterior distributions. The proposed method is illustrated by simulation studies and the analysis of a real data set. Both the simulation studies and the real data analysis indicate that the proposed approach performs well. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

41. Ridge estimation of covariance matrix from data in two classes.

Author: Zhou, Yi and Zhang, Bin
Subjects: *COVARIANCE matrices, *GAUSSIAN distribution, *DATA modeling
Abstract: This paper deals with the problem of estimating a covariance matrix from data in two classes: (1) good data with the covariance matrix of interest and (2) contamination coming from a Gaussian distribution with a different covariance matrix. The ridge penalty is introduced to address the challenges of high dimensionality in estimating the covariance matrix from the two-class data model. The ridge estimator of the covariance matrix has a uniform expression and remains positive definite, whether the data size is larger or smaller than the data dimension. Furthermore, the ridge parameter is tuned through a cross-validation procedure. Lastly, the proposed ridge estimator is shown to perform better than the existing estimator based on the two-class data and the traditional ridge estimator based only on the good data. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

42. Detecting shifts in Conway–Maxwell–Poisson profile with deviance residual-based CUSUM and EWMA charts under multicollinearity.

Author: Mammadova, Ulduz and Özkale, M. Revan
Abstract: Monitoring profiles with count responses is a common situation in industrial processes, and for a count-distributed process, the Conway–Maxwell–Poisson (COM-Poisson) regression model yields better outcomes for under- and overdispersed count variables. In this study, we propose CUSUM and EWMA charts based on the deviance residuals obtained from the COM-Poisson model, which are fitted by the PCR and r–k class estimators. We conducted a simulation study to evaluate the effect of additive and multiplicative shifts of various sizes, the number of predictors, and several dispersion levels, and to compare the performance of the proposed control charts with control charts in the literature in terms of average run length and standard deviation of run length. Moreover, a real data set is analyzed to assess the performance of the newly proposed control charts. The results show the superiority of the newly proposed control charts over some competitors, including CUSUM and EWMA control charts based on ML, PCR, and ridge deviance residuals, in the presence of multicollinearity. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

43. Linear Regression, Covariate Selection and the Failure of Modelling

Author: Davies, Laurie
Subjects: Statistics - Methodology, 62J07
Abstract: It is argued that all model based approaches to the selection of covariates in linear regression have failed. This applies to frequentist approaches based on P-values and to Bayesian approaches although for different reasons. In the first part of the paper 13 model based procedures are compared to the model-free Gaussian covariate procedure in terms of the covariates selected and the time required. The comparison is based on seven data sets and three simulations. There is nothing special about these data sets which are often used as examples in the literature. All the model based procedures failed. In the second part of the paper it is argued that the cause of this failure is the very use of a model. If the model involves all the available covariates standard P-values can be used. The use of P-values in this situation is quite straightforward. As soon as the model specifies only some unknown subset of the covariates the problem being to identify this subset the situation changes radically. There are many P-values, they are dependent and most of them are invalid. The P-value based approach collapses. The Bayesian paradigm also assumes a correct model but although there are no conceptual problems with a large number of covariates there is a considerable overhead causing computational and allocation problems even for moderately sized data sets. The Gaussian covariate procedure is based on P-values which are defined as the probability that a random Gaussian covariate is better than the covariate being considered. These P-values are exact and valid whatever the situation. The allocation requirements and the algorithmic complexity are both linear in the size of the data making the procedure capable of handling large data sets. It outperforms all the other procedures in every respect., Comment: 26 pages 4 figures
Published: 2021

44. Variable Selection Using a Smooth Information Criterion for Distributional Regression Models

Author: O'Neill, Meadhbh and Burke, Kevin
Subjects: Statistics - Methodology, 62J07
Abstract: Modern variable selection procedures make use of penalization methods to execute simultaneous model selection and estimation. A popular method is the LASSO (least absolute shrinkage and selection operator), the use of which requires selecting the value of a tuning parameter. This parameter is typically tuned by minimizing the cross-validation error or Bayesian information criterion (BIC) but this can be computationally intensive as it involves fitting an array of different models and selecting the best one. In contrast with this standard approach, we have developed a procedure based on the so-called "smooth IC" (SIC) in which the tuning parameter is automatically selected in one step. We also extend this model selection procedure to the distributional regression framework, which is more flexible than classical regression modelling. Distributional regression, also known as multiparameter regression (MPR), introduces flexibility by taking account of the effect of covariates through multiple distributional parameters simultaneously, e.g., mean and variance. These models are useful in the context of normal linear regression when the process under study exhibits heteroscedastic behaviour. Reformulating the distributional regression estimation problem in terms of penalized likelihood enables us to take advantage of the close relationship between model selection criteria and penalization. Utilizing the SIC is computationally advantageous, as it obviates the issue of having to choose multiple tuning parameters.
Published: 2021

45. Least angle regression in tangent space and LASSO for generalized linear models

Author: Hirose, Yoshihiro
Published: 2024
Full Text: View/download PDF

46. Variable selection for additive models with missing data via multiple imputation

Author: Shimazu, Yuta, Yamaguchi, Takayuki, Hoshina, Ibuki A. J., and Matsui, Hidetoshi
Published: 2024
Full Text: View/download PDF

47. A Comparative Analysis of Implementing Adaptive Lasso Penalty in Hierarchical Data: Quantile versus Mean Regression

Author: Maryaki, Forouzan Jafari and Golalizadeh, Mousa
Published: 2024
Full Text: View/download PDF

48. Comparing solution paths of sparse quadratic minimization with a Stieltjes matrix.

Author: He, Ziyu, Han, Shaoning, Gómez, Andrés, Cui, Ying, and Pang, Jong-Shi
Subjects: *STATISTICAL learning, *POLYNOMIAL time algorithms, *QUADRATIC programming, *PARAMETER estimation, *MATRICES (Mathematics)
Abstract: This paper studies several solution paths of sparse quadratic minimization problems as a function of the weighing parameter of the bi-objective of estimation loss versus solution sparsity. Three such paths are considered: the "ℓ0-path" where the discontinuous ℓ0-function provides the exact sparsity count; the "ℓ1-path" where the ℓ1-function provides a convex surrogate of sparsity count; and the "capped ℓ1-path" where the nonconvex nondifferentiable capped ℓ1-function aims to enhance the ℓ1-approximation. Serving different purposes, each of these three formulations is different from each other, both analytically and computationally. Our results deepen the understanding of (old and new) properties of the associated paths, highlight the pros, cons, and tradeoffs of these sparse optimization models, and provide numerical evidence to support the practical superiority of the capped ℓ1-path. Our study of the capped ℓ1-path is interesting in its own right as the path pertains to computable directionally stationary (= strongly locally minimizing in this context, as opposed to globally optimal) solutions of a parametric nonconvex nondifferentiable optimization problem. Motivated by classical parametric quadratic programming theory and reinforced by modern statistical learning studies, both casting an exponential perspective in fully describing such solution paths, we also aim to address the question of whether some of them can be fully traced in strongly polynomial time in the problem dimensions. A major conclusion of this paper is that a path of directional stationary solutions of the capped ℓ1-regularized problem offers interesting theoretical properties and practical compromise between the ℓ0-path and the ℓ1-path. Indeed, while the ℓ0-path is computationally prohibitive and greatly handicapped by the repeated solution of mixed-integer nonlinear programs, the quality of ℓ1-path, in terms of the two criteria—loss and sparsity—in the estimation objective, is inferior to the capped ℓ1-path; the latter can be obtained efficiently by a combination of a parametric pivoting-like scheme supplemented by an algorithm that takes advantage of the Z-matrix structure of the loss function. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

49. How global warming data are modeled via some novel mathematical programming scenarios in distributed lag model?

Author: Özbay, Nimet and Toker, Selma
Subjects: *MULTICOLLINEARITY, *MATHEMATICAL programming, *GLOBAL warming, *CARBON emissions, *DATA modeling, *PARAMETER estimation
Abstract: The method of Almon reduces multicollinearity to some degree in the distributed lag model; however, multicollinearity may not be fully remedied, since the Almon estimator relies on the ordinary least squares technique. In this context, the Almon ridge estimator, which includes one biasing parameter, is commonly preferred in this model. Based on recent advances, biased estimators that have more than one biasing parameter are regarded as advantageous over estimators with one biasing parameter. One such two-parameter estimator is the Almon two-parameter ridge estimator of Özbay (Iran J Sci Tech Trans Sci 43: 1819–1828, 2019), which regulates the multicollinearity with its first biasing parameter and improves the quality of fit of the regression with its second biasing parameter. As another method to eliminate multicollinearity, exact linear restrictions are employed for the Almon two-parameter ridge estimator, and the restricted Almon two-parameter ridge estimator was introduced by Özbay and Toker (Considering linear constraints for Almon two parameter ridge estimation. 11th International Statistics Congress (ISC 2019), Muğla, Turkey, 2019). In this article, the issue of selecting the biasing parameters of the restricted and unrestricted Almon two-parameter ridge estimators is handled with a mathematical programming approach instead of traditional selection methods. Different scenarios, in which the mean square error is minimized or the coefficient of multiple determination is maximized, are constructed with this mathematical programming approach. In the real-life data analysis, we focus on global warming as a trending topic to demonstrate the effect of the mathematical programming approach on the mentioned estimators. The dataset in question comprises carbon dioxide emissions, which have adverse effects on global warming by increasing the average global temperature. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

50. Liu-type estimator in Conway–Maxwell–Poisson regression model: theory, simulation and application.

Author: Tanış, Caner and Asar, Yasin
Subjects: *MULTICOLLINEARITY, *REGRESSION analysis, *MODEL theory, *MONTE Carlo method
Abstract: Recently, many authors have been motivated to propose new regression estimators for the case of multicollinearity. The most well-known of these estimators are the ridge, Liu and Liu-type estimators. Many studies on regression models have shown that the Liu-type estimator is a good alternative to the ridge and Liu estimators in the literature. We consider a new Liu-type estimator, an alternative to the ridge and Liu estimators in the Conway–Maxwell–Poisson regression model. Moreover, we study the theoretical properties of the Liu-type estimator, and we provide some theorems showing the conditions under which the Liu-type estimator is superior to the others. Since the Liu-type estimator has two parameters, we also propose a method to select the parameters. We designed a simulation study to demonstrate the superiority of the Liu-type estimator compared to the ridge and Liu estimators. We also evaluated the usefulness and superiority of the proposed regression estimator with a practical data example. As a result of the simulation and real-world data example, we conclude that the proposed regression estimator is superior to its competitors according to the mean square error criterion. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
