267 results
Search Results
2. Model-based joint curve registration and classification.
- Author
-
Tang, Lin, Zeng, Pengcheng, Shi, Jian Qing, and Kim, Won-Seok
- Subjects
- *
HYOID bone , *EXPECTATION-maximization algorithms , *RECORDING & registration , *LOGISTIC regression analysis , *CLASSIFICATION , *REGRESSION analysis , *CURVES - Abstract
In this paper, we consider the problem of classification of misaligned multivariate functional data. We propose to use a model-based approach for the joint registration and classification of such data. The observed functional inputs are modeled as a functional nonlinear mixed effects model containing a nonlinear functional fixed effect constructed upon warping functions to account for curve alignment, and a nonlinear functional random effects component to address the variability among subjects. The warping functions are also modeled to accommodate a common effect within groups and the variability between subjects. Then, a functional logistic regression model defined upon the representation of the aligned curves and scalar inputs is used to account for curve classification. EM-based algorithms are developed to perform maximum likelihood inference of the proposed models. The identifiability of the registration model and the asymptotic properties of the proposed method are established. The performance of the proposed procedure is illustrated via simulation studies and an analysis of a hyoid bone movement data application. Although the statistical developments proposed in this paper were motivated by the hyoid bone movement study, the methodology is designed and presented in generality and can be applied to numerous areas of scientific research. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
3. An optimized machine learning technology scheme and its application in fault detection in wireless sensor networks.
- Author
-
Fan, Fang, Chu, Shu-Chuan, Pan, Jeng-Shyang, Lin, Chuang, and Zhao, Huiqi
- Subjects
- *
WIRELESS sensor networks , *MACHINE learning , *BACK propagation , *PARTICLE swarm optimization , *EDIBLE fats & oils , *INTERNET of things - Abstract
Aiming at the problem of fault detection in data collection in wireless sensor networks, this paper combines evolutionary computing and machine learning to propose a productive technical solution. We choose the classical particle swarm optimization (PSO) algorithm and improve it, including the introduction of a biological population model to control the population size and the addition of a parallel mechanism for further tuning. The proposed RS-PPSO algorithm was successfully used to optimize the initial weights and biases of a back propagation neural network (BPNN), shortening the training time and raising the prediction accuracy. Wireless sensor networks (WSNs) have become the key supporting platform of the Internet of Things (IoT). The correctness of the data collected by the sensor nodes has a great influence on the reliability, real-time performance and energy saving of the entire network. The optimized machine learning scheme given in this paper can effectively identify faulty data, so as to ensure the effective operation of the WSN. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
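The abstract above does not spell out the RS-PPSO update rules (the population-size control and parallel mechanism are the paper's contributions), but the classical PSO it improves upon can be sketched as follows. This is a minimal, illustrative sketch: the toy quadratic loss merely stands in for a BPNN training error surface, and all names and parameter values are assumptions, not the paper's settings.

```python
import random

def pso_minimize(loss, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic particle swarm minimization over R^dim (not the paper's RS-PPSO)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # each particle's best position
    pbest_val = [loss(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm-wide best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + cognitive pull (own best) + social pull (swarm best).
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            v = loss(pos[i])
            if v < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], v
                if v < gbest_val:
                    gbest, gbest_val = pos[i][:], v
    return gbest, gbest_val

# Toy quadratic loss standing in for a BPNN training error surface.
best, val = pso_minimize(lambda p: sum((x - 0.5) ** 2 for x in p), dim=3)
```

In the paper's setting, `loss` would instead evaluate BPNN training error as a function of the flattened initial weights and biases.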
4. A distributed multiple sample testing for massive data.
- Author
-
Xie, Xiaoyue, Shi, Jian, and Song, Kai
- Subjects
- *
FRAUD investigation , *STATISTICAL accuracy , *FRAUD - Abstract
When the data are stored in a distributed manner, direct application of traditional hypothesis testing procedures is often prohibitive due to communication costs and privacy concerns. This paper mainly develops and investigates a distributed two-node Kolmogorov–Smirnov hypothesis testing scheme, implemented by the divide-and-conquer strategy. In addition, this paper also provides a distributed fraud detection method and a distribution-based classification method for multi-node machines based on the proposed hypothesis testing scheme. The fraud detection method identifies which nodes of a multi-node machine store fraudulent data, and the distribution-based classification determines whether the distributions across nodes differ and classifies the different distributions. These methods can improve the accuracy of statistical inference in a distributed storage architecture. Furthermore, this paper verifies the feasibility of the proposed methods through simulation and real-data studies. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
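The abstract above does not give the aggregation rule used in the divide-and-conquer scheme, but the node-level building block, the two-sample Kolmogorov–Smirnov statistic between samples held on two machines, can be sketched in pure Python. A minimal, tie-safe sketch (illustrative only, not the paper's procedure):

```python
def ks_two_sample(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: sup_t |F_x(t) - F_y(t)|,
    where F_x and F_y are the empirical CDFs of the two samples."""
    xs, ys = sorted(x), sorted(y)
    nx, ny = len(xs), len(ys)
    i = j = 0
    d = 0.0
    # Sweep the pooled order statistics; at each distinct value advance
    # both empirical CDFs together so ties are handled correctly.
    while i < nx and j < ny:
        v = min(xs[i], ys[j])
        while i < nx and xs[i] == v:
            i += 1
        while j < ny and ys[j] == v:
            j += 1
        d = max(d, abs(i / nx - j / ny))
    return d

d = ks_two_sample([1, 2], [2, 3])  # 0.5
```

In a two-node setting, each machine would hold one of the samples and only the sorted values (or a sketch of them) would be communicated; the abstract does not specify which variant the authors use.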
5. A new approach to modeling positive random variables with repeated measures.
- Author
-
de Freitas, João Victor B., Nobre, Juvêncio S., Bourguignon, Marcelo, and Santos-Neto, Manoel
- Subjects
- *
RANDOM variables , *MONTE Carlo method , *REGRESSION analysis , *GENERALIZED estimating equations - Abstract
In many situations, it is common to have more than one observation per experimental unit, giving rise to experiments with repeated measures. In the modeling of such experiments, it is necessary to consider and model the intra-unit dependency structure. In the literature, there are several proposals to model positive continuous data with repeated measures. In this paper, we propose one more, based on a generalization of the beta prime regression model. We consider the possibility of dependence between observations of the same unit. Residuals and diagnostic tools are also discussed. To evaluate the finite-sample performance of the estimators under different correlation matrices and distributions, we conducted a Monte Carlo simulation study. The proposed methodology is illustrated with an analysis of a real data set. Finally, we provide an R package that makes the methodology described in this paper publicly available. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
6. LIC criterion for optimal subset selection in distributed interval estimation.
- Author
-
Guo, Guangbao, Sun, Yue, Qian, Guoqi, and Wang, Qian
- Subjects
- *
SUBSET selection , *CLIENT/SERVER computing equipment , *ACQUISITION of data , *DATA analysis , *FEATURE selection - Abstract
Distributed interval estimation in linear regression may be computationally infeasible in the presence of big data that are normally stored in different computer servers or in the cloud. A further challenge is that the results from distributed estimation may still contain redundant information about the population characteristics of the data. To tackle this computing challenge, we develop an optimization procedure to select the best subset from the collection of data subsets, based on which we perform interval estimation in the context of linear regression. The procedure is derived by minimizing the length of the final interval estimator and maximizing the information retained in the selected data subset, and is therefore named the LIC criterion. The theoretical performance of the LIC criterion is studied in this paper together with a simulation study and a real data analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
7. On detecting the effect of exposure mixture.
- Author
-
Liu, Xinhua and Jin, Zhezhen
- Subjects
- *
INDEPENDENT variables , *STATISTICAL software - Abstract
To study the effect of an exposure mixture on continuous health outcomes, one can use a linear model with a weighted sum of multiple standardized exposure variables as an index predictor and its coefficient as the overall effect. The unknown weights typically range between zero and one, indicating the contributions of individual exposures to the overall effect. Because the weight parameters are present only when the parameter for the overall effect is non-zero, testing hypotheses on the overall effect can be challenging, especially when the number of exposure variables is above two. This paper presents a working-model-based approach to estimate the parameter for the overall effect and to test specific hypotheses, including two tests for detecting the overall effect and one test for detecting unequal weights when the overall effect is evident. The statistics are computationally easy, and one can apply existing statistical software to perform the analysis. A simulation study shows that the proposed estimators for the parameters of interest may have better finite-sample performance than some other estimators. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
8. Goodness-of-fit inference for the additive hazards regression model with clustered current status data.
- Author
-
Feng, Yanqin, Wang, Jie, and Li, Yang
- Subjects
- *
REGRESSION analysis , *GOODNESS-of-fit tests , *HAZARDS , *SURVIVAL analysis (Biometry) , *ADDITIVES , *MEDICAL research - Abstract
Clustered current status data are frequently encountered in biomedical research and other areas that require survival analysis. This paper proposes graphical and formal model assessment procedures to evaluate the goodness of fit of the additive hazards model to clustered current status data. The proposed test statistics are based on sums of martingale-based residuals. Relevant asymptotic properties are established, and the empirical distributions of the test statistics can be simulated using Gaussian multipliers. Extensive simulation studies confirmed that the proposed test procedures work well in practical scenarios. The proposed method applies when failure times within the same cluster are correlated and, in particular, when cluster sizes can be informative about intra-cluster correlations. The method is applied to analyze clustered current status data from a lung tumorigenicity study. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
9. An adjusted partial least squares regression framework to utilize additional exposure information in environmental mixture data analysis.
- Author
-
Du, Ruofei, Luo, Li, Hudson, Laurie G., Nozadi, Sara, and Lewis, Johnnye
- Subjects
- *
PARTIAL least squares regression , *LEAST squares , *STANDARD deviations , *ENVIRONMENTAL exposure , *DATA analysis - Abstract
In a large-scale environmental health population study composed of subprojects, often different fractions of the participants enrolled have measures of specific outcomes. It is conceptually reasonable to assume that the association study would benefit from utilizing additional exposure information from those whose specific outcome was not measured. Partial least squares regression is a practical approach for determining exposure-outcome associations in mixture data. Like a typical regression approach, however, partial least squares regression requires that each observation have both complete covariates and the outcome for model fitting. In this paper, we propose novel adjustments to general partial least squares regression to estimate and examine the association effects of individual environmental exposures on an outcome within a more complete context of the study population's environmental mixture exposures. The proposed framework takes advantage of the bilinear model structure. It allows information from all participants, with or without outcome values, to contribute to model fitting and the assessment of association effects. Using this proposed framework, incorporating additional information leads to smaller root mean square errors in the estimation of association effects and improves the ability to assess the significance of the effects. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
10. The re-parameterized inverse Gaussian regression to model length of stay of COVID-19 patients in the public health care system of Piracicaba, Brazil.
- Author
-
Hashimoto, E. M., Ortega, E. M. M., Cordeiro, G. M., Cancho, V. G., and Silva, I.
- Subjects
- *
INVERSE Gaussian distribution , *LENGTH of stay in hospitals , *REGRESSION analysis , *CENSORING (Statistics) , *MEDICAL care , *HOSPITAL admission & discharge - Abstract
Among the models applied to analyze survival data, a standout is the inverse Gaussian distribution, which belongs to the class of models for analyzing positive asymmetric data. However, the variance of this distribution depends on two parameters, which prevents establishing a functional relation with a linear predictor when the assumption of constant variance does not hold. In this context, the aim of this paper is to re-parameterize the inverse Gaussian distribution to enable establishing an association between a linear predictor and the variance. We propose deviance residuals to verify the model assumptions. Some simulations indicate that the distribution of these residuals approaches the standard normal distribution and that the mean squared errors of the estimators are small for large samples. Further, we fit the new model to the hospitalization times of COVID-19 patients in Piracicaba (Brazil), which indicates that men spend more time hospitalized than women, and that this pattern is more pronounced for individuals older than 60 years. The re-parameterized inverse Gaussian model proved to be a good alternative for analyzing censored data with non-constant variance. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
11. Beta-negative binomial nonlinear spatio-temporal random effects modeling of COVID-19 case counts in Japan.
- Author
-
Ueki, Masao
- Subjects
- *
COVID-19 pandemic , *RANDOM effects model , *NEGATIVE binomial distribution , *COVID-19 , *SARS-CoV-2 , *POISSON regression - Abstract
Coronavirus disease 2019 (COVID-19) caused by the SARS-CoV-2 virus has spread seriously throughout the world. Predicting the spread, or the number of cases, in the future can facilitate preparation for, and prevention of, a worst-case scenario. To achieve these purposes, statistical modeling using past data is one feasible approach. This paper describes spatio-temporal modeling of COVID-19 case counts in 47 prefectures of Japan using a nonlinear random effects model, where random effects are introduced to capture the heterogeneity of a number of model parameters associated with the prefectures. The negative binomial distribution is frequently used with the Paul-Held random effects model to account for overdispersion in count data; however, the negative binomial distribution is known to be incapable of accommodating extreme observations such as those found in the COVID-19 case count data. We therefore propose use of the beta-negative binomial distribution with the Paul-Held model. This distribution is a generalization of the negative binomial distribution that has attracted much attention in recent years because it can model extreme observations with analytical tractability. The proposed beta-negative binomial model was applied to multivariate count time series data of COVID-19 cases in the 47 prefectures of Japan. Evaluation by one-step-ahead prediction showed that the proposed model can accommodate extreme observations without sacrificing predictive performance. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
12. Inference of multicomponent stress-strength reliability following Topp-Leone distribution using progressively censored data.
- Author
-
Saini, Shubham, Tomer, Sachin, and Garg, Renu
- Subjects
- *
CENSORING (Statistics) , *MARKOV chain Monte Carlo , *ACCELERATED life testing , *GAMMA functions - Abstract
In this paper, the inference of multicomponent stress-strength reliability is derived using progressively censored samples from the Topp-Leone distribution. Both stress and strength variables are assumed to follow Topp-Leone distributions with different shape parameters. The maximum likelihood estimate along with the asymptotic confidence interval is developed. Boot-p and Boot-t confidence intervals are also constructed. The Bayes estimates under the generalized entropy loss function based on gamma priors are derived using Lindley's and Tierney-Kadane's approximations and Markov chain Monte Carlo methods. A simulation study is conducted to check the performance of the various estimation methods and different censoring schemes. A real data study shows the applicability of the proposed estimation methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
13. A robust latent CUSUM chart for monitoring customer attrition.
- Author
-
Wu, Chunjie, Wang, Zhijun, MacEachern, Steven, and Schneider, Jingjing
- Subjects
- *
STATISTICAL process control , *CONSUMERS , *LEAST squares , *MARKOV processes , *CUSUM technique - Abstract
In competitive businesses, such as insurance and telecommunications, customers can easily replace one provider with another, which leads to customer attrition. Keeping the customer attrition rate low is crucial for companies, since retaining a customer is more profitable than recruiting a new one. As a main statistical process control (SPC) method, the CUSUM scheme is able to detect small and persistent shifts in customer attrition. However, customer attrition summaries are typically available on an uneven time scale (e.g. 4-week and 5-week 'business months'), which may not satisfy the assumptions of traditional CUSUM designs. This paper develops a latent CUSUM chart based on an exponential model for monitoring 'monthly' customer attrition under varying time scales. Both maximum likelihood and least squares methods are studied; the former mostly performs better, while the latter is advantageous for quite small shifts. We apply a Markov chain algorithm to obtain the average run length (ARL), make calibrations for different combinations of parameters, and present reference tables of cutoffs. Three more complicated models are considered to test robustness to deviations from the initial model. Furthermore, a real example of monitoring monthly customer attrition from a Chinese insurance company is used to illustrate the scheme. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
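The latent CUSUM chart for uneven time scales is the paper's contribution and is not specified in the abstract above, but the tabular CUSUM it extends is standard and easy to sketch. A minimal one-sided version for detecting an upward mean shift (the threshold and allowance values below are illustrative, not calibrated ARL cutoffs):

```python
def cusum_upper(xs, mu0, k, h):
    """One-sided tabular CUSUM for an upward mean shift.

    S_i = max(0, S_{i-1} + x_i - mu0 - k), where mu0 is the in-control mean
    and k is the allowance (often half the shift to be detected).
    Returns the index of the first alarm (S_i > h), or None if no alarm.
    """
    s = 0.0
    for i, x in enumerate(xs):
        s = max(0.0, s + x - mu0 - k)
        if s > h:
            return i
    return None

# Example: process in control at mu0 = 10, then shifts up to 12;
# the statistic accumulates 1.5 per shifted observation and fires at index 7.
alarm = cusum_upper([10.0] * 5 + [12.0] * 10, mu0=10.0, k=0.5, h=4.0)
```

The paper's scheme replaces the observed statistic with a latent quantity under an exponential model so that 4-week and 5-week reporting periods can be monitored on a common footing; that machinery is beyond this sketch.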
14. Spot It! and balanced block designs: keys to better debate architecture for a plethora of candidates in presidential primaries?
- Author
-
Potthoff, Richard F.
- Subjects
- *
PRIMARIES , *BLOCK designs , *PRESIDENTIAL candidates , *CAMPAIGN debates , *DIFFERENCE sets - Abstract
U.S. presidential primary debates are influential but under-researched. Before 2015, all of these debates, both Democratic and Republican, had 10 candidates or fewer. The first Republican debate in 2015, however, had to accommodate 17 candidates. They were split into two segments, with the 10 best-polling candidates in the main (prime-time) segment and the others in an 'undercard' session. A comparable pattern applied for the next six Republican debates. Concern arose not only because many candidates were crowded into a session but also because the undercard candidates were seen as receiving inferior exposure. The Democratic presidential primary debates that started four years later encountered similar difficulty. An official policy limited the candidates in each of the first two debates to 20, randomly divided into two groups of 10 appearing on successive nights. As a remedy, this paper examines innovative debate plans, for different numbers of candidates, that feature symmetry among all candidates and entail many short segments with relatively few candidates in each. We apply combinatorial designs—balanced incomplete block designs and regular pairwise balanced designs, which are analogous to the games Spot It Jr.! Animals and (full-fledged) Spot It!, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
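The connection between Spot It! and balanced designs runs through projective planes: a deck built from a projective plane of prime order n has n² + n + 1 cards with n + 1 symbols each, and any two cards share exactly one symbol. A sketch of that standard construction (the general combinatorial fact, not the paper's debate schedules):

```python
from itertools import combinations

def spot_it_deck(n):
    """Cards of a projective plane of prime order n: n^2 + n + 1 cards,
    n + 1 symbols per card, and any two cards share exactly one symbol."""
    cards = []
    # One card per affine line y = m*x + b (mod n), tagged with its slope symbol.
    for m in range(n):
        for b in range(n):
            cards.append(frozenset([("pt", x, (m * x + b) % n) for x in range(n)]
                                   + [("slope", m)]))
    # One card per vertical line x = const, sharing a common point at infinity.
    for x in range(n):
        cards.append(frozenset([("pt", x, y) for y in range(n)] + [("inf",)]))
    # The line at infinity: all slope symbols plus the point at infinity.
    cards.append(frozenset([("slope", m) for m in range(n)] + [("inf",)]))
    return cards

deck = spot_it_deck(2)  # Fano plane: 7 cards of 3 symbols each
```

In the debate setting, cards correspond to segments and symbols to candidates, so every pair of candidates appears together in exactly one segment; the paper's regular pairwise balanced designs relax this to handle candidate counts that are not of the form n² + n + 1.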
15. Local Linear Regression and the problem of dimensionality: a remedial strategy via a new locally adaptive bandwidths selector.
- Author
-
Eguasa, O., Edionwe, E., and Mbegbu, J. I.
- Subjects
- *
BANDWIDTHS , *RESPONSE surfaces (Statistics) , *RECURRENT neural networks , *REGRESSION analysis - Abstract
Local Linear Regression (LLR) is a nonparametric regression model applied in the modeling phase of Response Surface Methodology (RSM). LLR does not make reference to any fixed parametric model. Hence, LLR is flexible and can capture local trends in the data that might be too complicated for OLS. However, besides the small sample sizes and sparse data that characterize RSM, the performance of the LLR model nosedives as the number of explanatory variables considered in the study increases. This phenomenon, popularly referred to as the curse of dimensionality, results in the scanty application of LLR in RSM. In this paper, we propose a novel locally adaptive bandwidths selector that, unlike fixed bandwidths and existing locally adaptive bandwidths selectors, takes into account both the number of explanatory variables in the study and their individual values at each data point. Single and multiple response problems from the literature and simulated data were used to compare the performance of the LLR_PAB with those of the OLS, LLR_FB and LLR_AB models. Neural network activation functions such as ReLU, Leaky-ReLU, SELU and SPOCU were also considered and gave a remarkable improvement in the loss function (Mean Squared Error) over the regression models utilized on the three data sets. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
16. Inference for a discretized stochastic logistic differential equation and its application to biological growth.
- Author
-
Delgado-Vences, F., Baltazar-Larios, F., Vargas, A. Ornelas, Morales-Bojórquez, E., Cruz-Escalona, V. H., and Salomón Aguilar, C.
- Subjects
- *
STOCHASTIC differential equations , *MAXIMUM likelihood statistics , *EXPECTATION-maximization algorithms - Abstract
In this paper, we present a method to fit a stochastic logistic differential equation (SLDE) to a set of highly sparse real data. We assume that the SLDE has two unknown parameters to be estimated. We calculate the maximum likelihood estimator (MLE) of the intrinsic growth rate and prove that the MLE is strongly consistent and asymptotically normal. For estimating the diffusion parameter, the quadratic variation of the data is used. We validate our method with several types of simulated data. For more realistic cases in which we observe discretizations of the solution, we use diffusion bridges and the stochastic expectation-maximization algorithm to estimate the parameters. Furthermore, even when we observe only one point per path for a given number of trajectories, we are still able to estimate the parameters of the SLDE. As far as we know, this is the first attempt to fit stochastic differential equations (SDEs) to these types of data. Finally, we apply our method to real data coming from a fishery. The proposed fitting method can be applied to other examples of SDEs and is highly applicable in several areas of science, especially in situations of sparse data. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
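The abstract above does not fix the exact form of the SLDE. Assuming the common multiplicative-noise logistic form dX_t = r X_t (1 - X_t/K) dt + σ X_t dW_t (an assumption for illustration, with r the intrinsic growth rate and σ the diffusion parameter), a discretized path of the kind such estimation methods consume can be simulated by the Euler-Maruyama scheme:

```python
import math
import random

def simulate_slde(x0, r, K, sigma, T=10.0, n_steps=1000, seed=0):
    """Euler-Maruyama path of dX = r*X*(1 - X/K) dt + sigma*X dW.

    This is one common parameterization of a stochastic logistic SDE;
    the paper's exact form may differ.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        x += r * x * (1 - x / K) * dt + sigma * x * dw
        path.append(x)
    return path

path = simulate_slde(x0=0.5, r=1.0, K=1.0, sigma=0.1)
```

With sigma = 0 the scheme reduces to the deterministic logistic equation and the path approaches the carrying capacity K, which provides a quick sanity check on the discretization.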
17. Review of Bayesian selection methods for categorical predictors using JAGS.
- Author
-
Jreich, Rana, Hatte, Christine, and Parent, Eric
- Subjects
- *
BAYESIAN field theory , *SENSITIVITY analysis - Abstract
The formulation of variable selection has been widely developed in the Bayesian literature by linking a random binary indicator to each variable. This Bayesian inference has the advantage of stochastically exploring the set of possible sub-models, whatever their dimension. Bayesian selection approaches appropriate for categorical predictors are generally beyond the scope of the standard Bayesian selection of regressors in the linear model, since all levels of a categorical variable should be jointly handled in the selection procedure. For categorical covariates, new strategies have been developed to detect the effect of grouped covariates rather than the single effect of a quantitative regressor. In this paper, we review three Bayesian selection methods for categorical predictors: Bayesian Group Lasso with Spike and Slab priors, Bayesian Sparse Group Selection and Bayesian Effect Fusion using model-based clustering. The motivation behind this paper is to provide detailed information about the implementation of these three Bayesian selection methods using the JAGS software. Selection performance and a sensitivity analysis of hyperparameter tuning for the prior specifications are assessed under various simulated scenarios. JAGS helps users implement these three Bayesian selection methods for more complex model structures, such as hierarchical ones with latent layers. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
18. Optimal designing of two-level skip-lot sampling reinspection plan.
- Author
-
Murugeswari, N., Jeyadurga, P., and Balamurali, S.
- Subjects
- *
INDUSTRIAL costs , *INDUSTRIAL applications - Abstract
Skip-lot sampling plans are often applied in industry to reduce the cost and effort of inspecting products with an excellent quality history. Because skip-lot sampling plans reduce the cost of inspection, they are attractive from an economic standpoint. In this paper, we develop a sampling plan by incorporating the idea of resampling into the two-level skip-lot sampling plan; the new plan is designated SkSP-2L.1-R. This paper presents the Markov chain formulation of the proposed plan along with the derivation of its performance measures. We also provide a design methodology to determine the optimal parameters of the SkSP-2L.1-R plan so as to minimize the average sample number, using the two-points-on-the-operating-characteristic-curve approach. Considering various combinations of producer and consumer quality levels along with the respective risks, a table is constructed to determine the optimal parameters. An industrial application of the proposed SkSP-2L.1-R plan is discussed. The SkSP-2L.1-R plan with a single sampling plan as the reference plan is compared with the conventional single sampling plan, the SkSP-2 plan and the SkSP-2-R plan, and the comparison shows that the proposed SkSP-2L.1-R plan outperforms them. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
19. A semi-analytical solution to the maximum-likelihood fit of Poisson data to a linear model using the Cash statistic.
- Author
-
Bonamente, Massimiliano and Spence, David
- Subjects
- *
DATA modeling , *INDEPENDENT variables , *ANALYTICAL solutions , *PARAMETER estimation - Abstract
The Cash statistic, also known as the C statistic, is commonly used for the analysis of low-count Poisson data, including data with null counts for certain values of the independent variable. The use of this statistic is especially attractive for low-count data that cannot be combined, or re-binned, without loss of resolution. This paper presents a new maximum-likelihood solution for the best-fit parameters of a linear model using the Poisson-based Cash statistic. The solution presented in this paper provides a new and simple method to measure the best-fit parameters of a linear model for any Poisson-based data, including data with null counts. In particular, the method enforces the requirement that the best-fit linear model be non-negative throughout the support of the independent variable. The method is summarized in a simple algorithm to fit Poisson counting data of any size and counting rate with a linear model, bypassing entirely the use of the traditional χ² statistic. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
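The Cash statistic itself has a standard closed form, C = 2 Σ_i [m_i − n_i + n_i ln(n_i/m_i)], where n_i are the observed counts and m_i the model predictions, with the n_i = 0 term reducing to 2 m_i. The paper's semi-analytical solution for the best-fit linear model is not reproduced here; this is just a direct implementation of the statistic:

```python
import math

def cash_statistic(counts, model):
    """Cash (C) statistic for Poisson data:
    C = 2 * sum_i (m_i - n_i + n_i * ln(n_i / m_i)),
    using the convention n_i * ln(n_i / m_i) = 0 when n_i = 0.
    Model values m_i must be strictly positive.
    """
    c = 0.0
    for n, m in zip(counts, model):
        if m <= 0:
            raise ValueError("model values must be strictly positive")
        c += m - n
        if n > 0:
            c += n * math.log(n / m)
    return 2.0 * c

cash_statistic([2, 3], [2.0, 3.0])  # exactly 0 for a perfect fit
```

Unlike χ², this handles bins with zero counts without re-binning, which is the property the abstract highlights; minimizing C over the parameters of a linear model m_i = a + b*x_i is what the paper solves semi-analytically.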
20. Time-varying coefficient cumulative gap time models for intensive longitudinal ecological momentary assessment data with missingness.
- Author
-
Li, Xiaoxue, Anderson, Stewart J., Shiffman, Saul, and Zhang, Bo
- Subjects
- *
ECOLOGICAL momentary assessments (Clinical psychology) , *SMOKING statistics , *MISSING data (Statistics) , *MEMORY bias - Abstract
Ecological momentary assessment (EMA) studies investigate intensive repeated observations of the current behavior and experiences of subjects in real time. In particular, such studies aim to minimize recall bias and maximize ecological validity, thereby strengthening the investigation and inference of microprocesses that influence behavior in real-world contexts by gathering intensive information on the temporal patterning of behavior of study subjects. Throughout this paper, we focus on the data analysis of an EMA study that examined the behavior of intermittent smokers (ITS). Specifically, we sought to explore the pattern of clustered smoking behavior of ITS, or smoking 'bouts', as well as the covariates that predict such smoking behavior. To do this, in this paper we introduce a framework for characterizing the temporal behavior of ITS via functions of the event gap time to distinguish the smoking bouts. We used time-varying coefficient models for the cumulative log gap time to characterize the temporal patterns of smoking behavior while simultaneously adjusting for behavioral covariates, and incorporated inverse probability weighting into the models to accommodate missing data. Simulation studies showed that, irrespective of whether data were missing by design or missing at random, the model was able to reliably determine prespecified time-varying functional forms of a given covariate coefficient, provided the within-subject level was small. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
21. Predictive analysis for joint progressive censoring plans: a Bayesian approach.
- Author
-
Ahmadi, Mohammad Vali and Doostparast, Mahdi
- Subjects
- *
CENSORSHIP , *MANUFACTURING processes , *UNITS of time , *CENSORING (Statistics) , *DISTRIBUTION (Probability theory) , *ERROR functions - Abstract
Comparative lifetime experiments are of particular importance in production processes when one wishes to determine the relative merits of several competing products with regard to their reliability. This paper confines itself to data obtained by running a joint progressive Type-II censoring plan on samples in a combined manner. The problem of Bayesian prediction of the failure times of surviving units is discussed in detail when the parent populations are exponential. Two real data sets are analyzed to illustrate the inferential procedures developed here. When destructive experiments under a censoring scheme are finished, researchers are usually interested in estimating the remaining lifetimes of surviving units for subsequent experiments. The findings of this paper are useful for these purposes, especially when samples are non-homogeneous, such as those taken from industrial storages. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
22. A random effect regression based on the odd log-logistic generalized inverse Gaussian distribution.
- Author
-
Vasconcelos, J. C. S., Cordeiro, G. M., Ortega, E. M. M., and Silva, G. O.
- Subjects
- *
INVERSE Gaussian distribution , *GAUSSIAN distribution , *RANDOM effects model , *REGRESSION analysis , *PRICES - Abstract
In recent decades, the use of regression models with random effects has made great progress. Among these models' attractions is the flexibility to analyze correlated data. In various situations, the distribution of the response variable presents asymmetry or bimodality. In these cases, it is possible to use normal regression with a random effect at the intercept. In light of these contexts, i.e. the desire to analyze correlated data in the presence of bimodality or asymmetry, in this paper we propose a regression model with a random effect at the intercept based on the odd log-logistic generalized inverse Gaussian distribution for correlated data. Maximum likelihood is adopted to estimate the parameters, and various simulations are performed for correlated data. A type of residual for the new regression is proposed whose empirical distribution is close to normal. The versatility of the new regression is demonstrated by estimating the average price per hectare of bare land in 10 municipalities in the state of São Paulo (Brazil). In this context, various databases are constantly emerging, requiring flexible modeling. Thus, the model is likely to be of interest to data analysts and can make a good contribution to the statistical literature. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
23. Generalized fiducial methods for testing the homogeneity of a three-sample problem with a mixture structure.
- Author
-
Ren, Pengcheng, Liu, Guanfu, and Pu, Xiaolong
- Subjects
- *
TEST methods , *HOMOGENEITY , *MIXTURES , *SAMPLE size (Statistics) - Abstract
Recently, the likelihood ratio (LR) test was proposed to test the homogeneity of a three-sample model with a mixture structure. Because of the presence of the mixture structure, the null limiting distribution of the LR test has a complicated supremum form, which leads to challenges in determining p-values. In addition, the LR test cannot control type-I errors well under small to moderate sample size. In this paper, we propose seven generalized fiducial methods to test the homogeneity of the three-sample model. Via simulation studies, we find that our methods perform significantly better than the LR test method in controlling the type-I errors under small to moderate sample size, while they have comparable powers in most cases. A halibut data example is used to illustrate the proposed methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
24. Homogeneity test of relative risk ratios for stratified bilateral data under different algorithms.
- Author
-
Mou, Ke-Yi, Ma, Chang-Xing, and Li, Zhi-Ming
- Subjects
- *
FALSE positive error , *MEAN square algorithms , *MONTE Carlo method , *MAXIMUM likelihood statistics , *HOMOGENEITY , *ODDS ratio - Abstract
Medical clinical studies of paired body parts often involve stratified bilateral data. The correlation between responses from paired parts should be taken into account to avoid biased or misleading results. This paper aims to test whether the relative risk ratios across strata are equal under optimal algorithms. Based on different algorithms, we obtain the desired global and constrained maximum likelihood estimations (MLEs). Three asymptotic test statistics (i.e. T_L, T_SC and T_W) are proposed. Monte Carlo simulations are conducted to evaluate the performance of these algorithms with respect to the mean square errors of the MLEs and the convergence rate. The empirical results show that the Fisher scoring algorithm is usually better than the other methods, since it has an effective convergence rate for global MLEs and yields lower mean square errors for constrained MLEs. The three test statistics are compared in terms of type I error rate (TIE) and power. Among these statistics, T_SC is recommended on account of its robust TIEs and satisfactory power. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
25. Ultrastructural calibration model for proficiency testing.
- Author
-
Aoki, Reiko, Leão, Dorival, Bustamante, Juan P. Mamani, and Vilca, Filidor
- Subjects
- *
FISHER information , *RANDOM matrices , *ERRORS-in-variables models - Abstract
Proficiency testing (PT) determines the performance of individual laboratories for specific tests or measurements, and it is used to monitor the reliability of laboratory measurements. PT plays a highly valuable role as it provides objective evidence of the competence of the participant laboratories. In this paper, we propose a multivariate calibration model to assess equivalence among laboratory measurements in PT. Our method allows us to deal with multivariate data, where the item under test is measured at different levels. Although intuitive, the proposed model is nonergodic, which means that the asymptotic Fisher information matrix is random. As a consequence, a detailed asymptotic analysis was carried out to establish the strategy for comparing the results of the participating laboratories. To illustrate, we apply our method to analyze data from the Brazilian engine test group PT program, where the power of an engine was measured by eight laboratories at several levels of rotation. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
26. Applications of monitoring and tracing the evolution of clustering solutions in dynamic datasets.
- Author
-
Atif, Muhammad, Shafiq, Muhammad, and Leisch, Friedrich
- Subjects
- *
DATA mining , *POLICY sciences - Abstract
The clustering approach is widely accepted as the most prominent unsupervised learning problem in data mining. This procedure deals with the identification of notable structures in unlabeled datasets. In modern days, clustering of dynamic data streams plays a vital role in policy-making, and researchers are paying particular attention to monitoring the evolution of clustering solutions over time. Data streams evolve continually, and different sources generate data items over time. The clustering solution over such a stream is not stationary and changes with the influx of new data items. This paper presents a comprehensive study of algorithms for tracing the evolution of clusters over time in cumulative datasets. To demonstrate the applications and significance of tracing cluster evolution, we implement the MONIC algorithm in the R software. This article illustrates how the data segmentation of dynamic streams is done and shows the applications of monitoring changes in clustering solutions with the help of real-life published datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
27. Estimating the parameters of a dependent model and applying it to environmental data set.
- Author
-
Mohtashami-Borzadaran, V., Amini, M., and Ahmadi, J.
- Subjects
- *
MONTE Carlo method , *MAXIMUM likelihood statistics , *MOMENTS method (Statistics) , *RANDOM numbers , *NUMBER systems - Abstract
In this paper, a new dependent model is introduced. The model is motivated by the structure of series-parallel systems, consisting of two series-parallel systems with a random number of parallel sub-systems that have fixed components connected in series. The dependence properties of the proposed model are studied. Two estimation methods, namely the method of moments and the maximum likelihood method, are applied to estimate the parameters of the distributions of the components based on observing the system's lifetime data. A Monte Carlo simulation study is used to evaluate the performance of the estimators. Two real data sets are used to illustrate the proposed method. The results are useful for researchers and practitioners interested in analyzing bivariate data related to extreme events. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
28. Curve fitting and jump detection on nonparametric regression with missing data.
- Author
-
Li, Qianyi, Li, Jianbo, Cheng, Yongran, and Zhang, Riquan
- Subjects
- *
MISSING data (Statistics) , *NONPARAMETRIC estimation , *SUM of squares , *REGRESSION analysis , *CURVE fitting , *CONJUNCTIVITIS - Abstract
In this paper, by virtue of the inverse probability weighting technique, we consider jump-preserving estimation in nonparametric regression models with missing data on the response variable. First, we use local piecewise-linear expansions with left and right kernels, respectively, to approximate the unknown regression function. Second, we obtain the left- and right-limit estimates of the regression function at each observed point and then determine the final estimators by residual sums of squares. Third, we present the convergence rates of the estimators and the residual sums of squares. Finally, we illustrate the performance of our proposed method through simulation studies and a conjunctivitis example from The Affiliated Hospital of Hangzhou Normal University. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
29. Monitoring the Weibull shape parameter under progressive censoring in presence of independent competing risks.
- Author
-
Moharib Alsarray, Rusul Mohsin, Kazempoor, Jaber, and Ahmadi Nadi, Adel
- Subjects
- *
QUALITY control charts , *COMPETING risks , *WEIBULL distribution , *CENSORSHIP , *SAMPLE size (Statistics) - Abstract
In this paper, monitoring the Weibull shape parameter arising from progressively censored competing risks data is investigated. The competing risks are assumed to be independent and not identically distributed, following Weibull distributions with different shape and scale parameters. Both shape parameters can be monitored separately by the proposed control charts using censored and predicted observations. We also introduce a control chart for monitoring both shape parameters simultaneously to detect possible shifts in both opposite and the same directions. In addition, the problem of masked data is discussed and an efficient prediction method is proposed. The behavior of the average run length with and without masked data is investigated through extensive simulations. Furthermore, the effects of sample size, number of failures due to each risk, and censoring scheme on the charts' performance are also studied. Finally, an illustrative example is presented to demonstrate the application of the proposed control charts by investigating a real data set of the failure times of two-component ARC-1 VHF communication transmitter receivers of a single commercial airline. Although this data set has been widely investigated in reliability analysis studies, this is the first time it has been analyzed in a statistical process monitoring setting. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
30. Functional distributional clustering using spatio-temporal data.
- Author
-
Venkatasubramaniam, A., Evers, L., Thakuriah, P., and Ampountolas, K.
- Subjects
- *
HIERARCHICAL clustering (Cluster analysis) , *CUMULATIVE distribution function , *SENSOR networks , *CENTRAL business districts - Abstract
This paper presents a new method, called the functional distributional clustering algorithm (FDCA), that seeks to identify spatially contiguous clusters and incorporate changes in temporal patterns across overcrowded networks. The method is motivated by a graph-based network composed of sensors arranged over space, where the recorded observations for each sensor represent a multi-modal distribution. The proposed method is fully non-parametric and generates clusters within an agglomerative hierarchical clustering approach, based on a measure of distance that defines a cumulative distribution function over temporal changes for different locations in space. Traditional hierarchical clustering algorithms that are spatially adapted do not typically accommodate the temporal characteristics of the underlying data. The effectiveness of the FDCA is illustrated using an application to both empirical and simulated data from about 400 sensors in a 2.5 square mile network area in downtown San Francisco, California. The results demonstrate the superior ability of the FDCA in identifying true clusters compared to functional-only and distributional-only algorithms, and similar performance to a model-based clustering algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
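The distributional side of the FDCA in the entry above, clustering locations by distances between their value distributions rather than by summary statistics alone, can be sketched with generic tools. This sketch omits the FDCA's spatial contiguity constraint and its temporal CDF construction, and the sensor samples below are simulated, not the San Francisco data.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform
from scipy.stats import wasserstein_distance

# Toy stand-in: each "sensor" holds a sample of observations, and the empirical
# distribution of that sample (not just its mean) drives the clustering.
rng = np.random.default_rng(4)
sensors = [rng.normal(0, 1, 200) for _ in range(4)] + \
          [rng.normal(3, 1, 200) for _ in range(4)]

# Pairwise distances between the sensors' empirical distributions
n = len(sensors)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        D[i, j] = D[j, i] = wasserstein_distance(sensors[i], sensors[j])

# Agglomerative hierarchical clustering on the condensed distance matrix,
# cut into two clusters
labels = fcluster(linkage(squareform(D), method="average"), t=2, criterion="maxclust")
```

With this setup the first four sensors and the last four sensors fall into two separate clusters, since their distributions differ in location.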
31. The unit log–log distribution: a new unit distribution with alternative quantile regression modeling and educational measurements applications.
- Author
-
Korkmaz, Mustafa Ç. and Korkmaz, Zehra Sedef
- Subjects
- *
QUANTILE regression , *REGRESSION analysis , *EDUCATIONAL tests & measurements , *MONTE Carlo method , *STOCHASTIC orders , *MAXIMUM likelihood statistics - Abstract
In this paper, we propose a new distribution, named the unit log–log distribution, defined on the bounded (0,1) interval. Basic distributional properties such as model shapes, stochastic ordering, the quantile function, moments, and order statistics of the newly defined unit distribution are studied. The maximum likelihood estimation method is used to estimate its model parameters. A new quantile regression model based on the proposed distribution is introduced, and estimation of its model parameters is also derived. Monte Carlo simulation studies are presented to assess the performance of the estimation method for the new unit distribution and its regression modeling. Applications of the newly defined distribution and its quantile regression model to real data sets show that the proposed models have better modeling abilities than competitive models. The proposed unit quantile regression model targets the linear relation between educational measurements of both OECD (Organization for Economic Co-operation and Development) countries and some non-member countries and their Better Life Index. Significant covariates are found in the real data applications for the unit median response. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
32. Testing quantitative trait locus effects in genetic backcross studies with double recombination occurring.
- Author
-
Liu, Guanfu and Hu, Zongliang
- Subjects
- *
LOCUS (Genetics) , *LIKELIHOOD ratio tests - Abstract
Testing the existence of quantitative trait locus (QTL) effects is an important task in QTL mapping studies. In this paper, we assume that the phenotype distributions come from a location-scale distribution family, and consider testing the QTL effects in both location and scale in backcross studies with double recombination occurring. Without the equal-scale assumption, the log-likelihood function is unbounded, which renders the traditional likelihood ratio test invalid. To deal with this problem, we propose a penalized likelihood ratio test (PLRT) for testing the QTL effects. The null limiting distribution of the PLRT is shown to be the supremum of a chi-square process. As a complement, we also investigate the null limiting distribution of the likelihood ratio test for the case with the equal-scale assumption. The limiting distributions of the two tests under local alternatives are also studied. Simulation studies are performed to evaluate the asymptotic results and a real-data example is given for illustration. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
33. Two-sample Behrens–Fisher problems for high-dimensional data: a normal reference scale-invariant test.
- Author
-
Zhang, Liang, Zhu, Tianming, and Zhang, Jin-Ting
- Subjects
- *
CHI-square distribution , *NULL hypothesis , *COVARIANCE matrices - Abstract
For high-dimensional two-sample Behrens–Fisher problems, several non-scale-invariant and scale-invariant tests have been proposed. Most of them impose strong assumptions on the underlying group covariance matrices so that their test statistics are asymptotically normal. However, in practice, these assumptions may not be satisfied or can hardly be checked, so these tests may not maintain the nominal size well. To overcome this difficulty, in this paper a normal reference scale-invariant test is proposed and studied. It works well by neither imposing strong assumptions on the underlying group covariance matrices nor assuming their equality. It is shown that under some regularity conditions and the null hypothesis, the proposed test and a chi-square-type mixture have the same normal and non-normal limiting distributions. It is then justifiable to approximate the null distribution of the proposed test using that of the chi-square-type mixture. The distribution of the chi-square-type mixture can be well approximated by the Welch–Satterthwaite chi-square approximation, with the approximation parameter consistently estimated from the data. The asymptotic power of the proposed test is established. Numerical results demonstrate that the proposed test has much better size control and power than several well-known non-scale-invariant and scale-invariant tests. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
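The Welch–Satterthwaite step described in the entry above can be illustrated with a short sketch: for a chi-square-type mixture T = sum_i lam[i] * chi2_1, the approximation matches the first two moments of T with a scaled chi-square c * chi2_d. The weights `lam` below are made-up illustrative values, not estimates from the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical weights of a chi-square-type mixture T = sum_i lam[i] * chi2_1
lam = np.array([4.0, 2.0, 1.0, 0.5, 0.25])

# Welch-Satterthwaite: match E[T] and Var[T] with a scaled chi-square c * chi2_d
c = (lam**2).sum() / lam.sum()       # scale factor
d = lam.sum()**2 / (lam**2).sum()    # approximate degrees of freedom

# Monte Carlo check: the approximation should roughly reproduce tail probabilities
T = (lam * rng.chisquare(1, size=(200_000, lam.size))).sum(axis=1)
q = np.quantile(T, 0.95)             # empirical 95th percentile of the mixture
approx_p = stats.chi2.sf(q / c, d)   # approximate P(T > q), ideally near 0.05
```

By construction, c * chi2_d has exactly the same mean and variance as the mixture; only the higher moments differ.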
34. Noise-insensitive discriminative subspace fuzzy clustering.
- Author
-
Zhi, Xiaobin, Yu, Tongjun, Bi, Longtao, and Li, Yalan
- Subjects
- *
FISHER discriminant analysis , *FUZZY algorithms , *EUCLIDEAN distance , *EXPONENTIAL functions - Abstract
Discriminative subspace clustering (DSC) can make full use of linear discriminant analysis (LDA) to reduce the dimension of data and achieve effective clustering of high-dimensional data by clustering low-dimensional data in a discriminant subspace. However, most existing DSC algorithms do not consider the noise and outliers that may be contained in data sets, so when they are applied to data sets with noise or outliers, they often perform poorly due to their influence. In this paper, we address the sensitivity of DSC to noise and outliers. Replacing the Euclidean distance in the objective function of LDA with an exponential non-Euclidean distance, we first develop a noise-insensitive LDA (NILDA) algorithm. Then, combining the proposed NILDA and a noise-insensitive fuzzy clustering algorithm, AFKM, we propose a noise-insensitive discriminative subspace fuzzy clustering (NIDSFC) algorithm. Experiments on some benchmark data sets show the effectiveness of the proposed NIDSFC algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
35. Classification of histogram-valued data with support histogram machines.
- Author
-
Kang, Ilsuk, Park, Cheolwoo, Yoon, Young Joo, Park, Changyi, Kwon, Soon-Sun, and Choi, Hosik
- Subjects
- *
HISTOGRAMS , *SUPPORT vector machines , *CLASSIFICATION - Abstract
The current large amounts of data and advanced technologies have produced new types of complex data, such as histogram-valued data. The paper focuses on classification problems when predictors are observed as or aggregated into histograms. Because conventional classification methods take vectors as input, a natural approach converts histograms into vector-valued data using summary values, such as the mean or median. However, this approach forgoes the distributional information available in histograms. To address this issue, we propose a margin-based classifier called support histogram machine (SHM) for histogram-valued data. We adopt the support vector machine framework and the Wasserstein-Kantorovich metric to measure distances between histograms. The proposed optimization problem is solved by a dual approach. We then test the proposed SHM via simulated and real examples and demonstrate its superior performance to summary-value-based methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
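The Wasserstein–Kantorovich metric used by the SHM in the entry above has a simple closed form for one-dimensional histograms: it equals the area between the two cumulative distribution functions. A minimal sketch, where the bin grid and the counts `h1`, `h2` are made-up illustrative values, not data from the paper:

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Two histogram-valued observations on a shared grid of bin centers
centers = np.array([0.5, 1.5, 2.5, 3.5])
h1 = np.array([10, 30, 40, 20], dtype=float)   # counts for observation 1
h2 = np.array([25, 35, 25, 15], dtype=float)   # counts for observation 2

# 1-Wasserstein (Kantorovich) distance between the normalized histograms
d = wasserstein_distance(centers, centers, u_weights=h1, v_weights=h2)

# Equivalent manual computation: |CDF1 - CDF2| summed over bins times bin width
c1, c2 = np.cumsum(h1 / h1.sum()), np.cumsum(h2 / h2.sum())
d_manual = np.sum(np.abs(c1 - c2)) * 1.0   # bin width is 1 here
```

A classifier could then plug such distances into a kernel, which is the kind of combination the SHM framework exploits.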
36. Model estimation and selection for partial linear varying coefficient EV models with longitudinal data.
- Author
-
Zhao, Mingtao, Xu, Xiaoli, Zhu, Yanling, Zhang, Kongsheng, and Zhou, Yan
- Subjects
- *
PANEL analysis , *REGULARIZATION parameter , *MEASUREMENT errors , *DATA modeling - Abstract
In this paper, we consider estimation and model selection for longitudinal partial linear varying coefficient errors-in-variables (EV) models, where the covariates are measured with additive errors. A bias-corrected penalized quadratic inference functions method is proposed, based on quadratic inference functions with two penalty terms. The proposed method can not only handle the measurement errors of covariates and within-subject correlations but also estimate and select significant non-zero parametric and nonparametric components simultaneously. Under some regularity conditions, the resulting parameter estimators are asymptotically normal and the estimator of the nonparametric varying coefficients achieves the optimal convergence rate. Furthermore, we present simulation studies and a real example analysis to evaluate the finite sample performance of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
37. An effective deep residual network based class attention layer with bidirectional LSTM for diagnosis and classification of COVID-19.
- Author
-
Pustokhin, Denis A., Pustokhina, Irina V., Dinh, Phuoc Nguyen, Phan, Son Van, Nguyen, Gia Nhu, Joshi, Gyanendra Prasad, and K., Shankar
- Subjects
- *
COVID-19 testing , *FEATURE extraction , *DEEP learning , *COVID-19 pandemic , *MEDICAL screening - Abstract
In recent days, the COVID-19 pandemic has affected many people's lives globally and necessitates a massive number of screening tests to detect the existence of the coronavirus. At the same time, the rise of deep learning (DL) concepts helps to effectively develop COVID-19 diagnosis models that attain a maximum detection rate with minimum computation time. This paper presents a new Residual Network (ResNet) based Class Attention Layer with Bidirectional LSTM, called RCAL-BiLSTM, for COVID-19 diagnosis. The proposed RCAL-BiLSTM model involves a series of processes, namely bilateral filtering (BF) based preprocessing, RCAL-BiLSTM based feature extraction, and softmax (SM) based classification. Once the BF technique produces the preprocessed image, the RCAL-BiLSTM based feature extraction process takes place using three modules, namely the ResNet based feature extraction, CAL, and Bi-LSTM modules. Finally, the SM layer is applied to categorize the feature vectors into corresponding feature maps. The experimental validation of the presented RCAL-BiLSTM model is conducted on a chest X-ray dataset and the results are assessed under several aspects. The experimental outcomes point to the superior performance of the RCAL-BiLSTM model, which attains a maximum sensitivity of 93.28%, specificity of 94.61%, precision of 94.90%, accuracy of 94.88%, F-score of 93.10% and kappa value of 91.40%. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
38. A practical two-sample test for weighted random graphs.
- Author
-
Yuan, Mingao and Wen, Qian
- Subjects
- *
RANDOM graphs , *WEIGHTED graphs , *GAUSSIAN distribution , *NULL hypothesis , *MACHINE learning , *DATA analysis - Abstract
Network (graph) data analysis is a popular research topic in statistics and machine learning. In application, one is frequently confronted with graph two-sample hypothesis testing, where the goal is to test the difference between two graph populations. Several statistical tests have been devised for this purpose in the context of binary graphs. However, many practical networks are weighted, and existing procedures cannot be directly applied to weighted graphs. In this paper, we study the weighted graph two-sample hypothesis testing problem and propose a practical test statistic. We prove that the proposed test statistic converges in distribution to the standard normal distribution under the null hypothesis and analyze its power theoretically. The simulation study shows that the proposed test has satisfactory performance and substantially outperforms the existing counterpart in the binary graph case. A real data application is provided to illustrate the method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
39. Bayes factor testing of equality and order constraints on measures of association in social research.
- Author
-
Mulder, Joris and Gelissen, John P. T. M.
- Subjects
- *
SOCIAL science research , *RESEARCH questions , *SATISFACTION , *EQUALITY , *HYPOTHESIS - Abstract
Measures of association play a central role in the social sciences to quantify the strength of a linear relationship between the variables of interest. In many applications, researchers can translate scientific expectations into hypotheses with equality and/or order constraints on these measures of association. In this paper, a Bayes factor test is proposed for testing multiple hypotheses with constraints on the measures of association between ordinal and/or continuous variables, possibly after correcting for certain covariates. This test can be used to obtain a direct answer to the research question of how much evidence there is in the data for a social science theory relative to competing theories. The stand-alone software package 'BCT' allows users to apply the methodology in an easy manner. The methodology will also be available in the R package 'BFpack'. An empirical application from leisure studies about the associations between life, leisure and relationship satisfaction, and an application about differences in egalitarian justice beliefs across countries, are used to illustrate the methodology. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
40. A novel group VIF regression for group variable selection with application to multiple change-point detection.
- Author
-
Ding, Hao, Zhang, Yan, and Wu, Yuehua
- Subjects
- *
CHANGE-point problems , *BIG data , *FAT , *AIR pollution - Abstract
In this paper, we propose a novel group variance inflation factor (VIF) regression model for tackling large data sets where the data follow a grouped structure. Unlike classical penalized methods, this approach can perform group variable selection in a sparse model. We further adapt the proposed method, in association with a two-stage procedure, for detecting multiple change-points in linear models. We carry out extensive simulation studies to show that the proposed group variable selection and change-point detection methods are stable and efficient. Finally, we provide two real data examples, a body fat data set and an air pollution data set, to illustrate the performance of our algorithms in group selection and change-point detection. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
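The plain variance inflation factor underlying VIF regression can be sketched as follows. This computes classical per-column VIFs, VIF_j = 1 / (1 - R_j^2), not the group VIF statistic of the entry above, and the simulated predictors are illustrative.

```python
import numpy as np

def vif(X):
    """Variance inflation factors: VIF_j = 1 / (1 - R_j^2), where R_j^2 is the
    R-squared from regressing column j on the remaining columns (with intercept)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
        out[j] = 1.0 / (1.0 - r2)
    return out

# Illustrative data: x1 and x2 nearly collinear, x3 independent
rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)   # nearly collinear with x1 -> large VIF
x3 = rng.normal(size=200)              # independent -> VIF near 1
v = vif(np.column_stack([x1, x2, x3]))
```

Large VIFs for the first two columns flag the collinearity that VIF-based selection procedures exploit.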
41. Inferences for multiple interval type-I censoring scheme.
- Author
-
Agnihotri, Shubham, Kumar Singh, Sanjay, and Singh, Umesh
- Subjects
- *
MAXIMUM likelihood statistics , *BAYES' estimation , *CENSORSHIP , *ERROR functions - Abstract
In this paper, we introduce a new type of censoring scheme, named the multiple interval type-I censoring scheme. Further, we assume that the test units are drawn from a Weibull population. We also propose maximum product of spacings estimators for the unknown parameters under the multiple interval type-I censoring scheme and compare them with the existing maximum likelihood estimators. In addition, Bayes estimators for the shape and scale parameters are obtained under the squared error loss function. The corresponding asymptotic confidence/credible intervals are also discussed. A real data set containing the breakdown times of insulating fluids is used to demonstrate the appropriateness of the proposed methodology. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
42. Cubic rank transmuted generalized Gompertz distribution: properties and applications.
- Author
-
Taniş, Caner and Saraçoğlu, Buğra
- Subjects
- *
DISTRIBUTION (Probability theory) , *MONTE Carlo method , *GENERATING functions , *KURTOSIS - Abstract
In this paper, we introduce a new lifetime distribution as an alternative to the generalized Gompertz and Gompertz distributions and their modified versions. This new distribution is a special case of the family of distributions introduced by Granzotto et al. [D.C.T. Granzotto, F. Louzada and N. Balakrishnan, Cubic rank transmuted distributions: inferential issues and applications, J. Stat. Comput. Simul. 87 (2017), pp. 2760–2778]. We obtain some characteristic properties of the suggested distribution, such as the hazard function, ordinary moments, coefficients of skewness and kurtosis, moment generating function, quantile function and median. We discuss three different methods of estimation for the parameters of the proposed distribution. A comprehensive Monte Carlo simulation study is performed in order to compare the performances of the estimators in terms of mean square errors and biases. Finally, three real data applications are performed to illustrate the usefulness of the suggested distribution. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
43. Misspecification of a binary dependent variable in the logistic model controlling for the repeated longitudinal measures.
- Author
-
Wang, Chun-Chao, Hwang, Yi-Ting, Chou, Chung-Chuan, and Lee, Hui-Ling
- Subjects
- *
DEPENDENT variables , *MONTE Carlo method , *LOGISTIC regression analysis , *EXPECTATION-maximization algorithms , *ATRIAL fibrillation , *LATENT variables - Abstract
Many medical applications are concerned with determining disease status. The disease status can be related to multiple serial measurements. Nevertheless, owing to various reasons, the binary outcome can be measured incorrectly, and estimators derived from the misspecified outcome can be biased. This paper derives the complete-data likelihood function to incorporate both the multiple serial measurements and the misspecified outcome. Owing to the latent variables, the EM algorithm is used to derive the maximum likelihood estimators. Monte Carlo simulations are conducted to compare the impact of misspecification on the estimates. Retrospective data on the recurrence of atrial fibrillation are used to illustrate the usage of the proposed model. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
44. Monitoring process mean and dispersion with one double generally weighted moving average control chart.
- Author
-
Chatterjee, Kashinath, Koukouvinos, Christos, and Lappa, Angeliki
- Subjects
- *
QUALITY control charts , *MOVING average process , *PROCESS control systems , *STATISTICAL process control , *AUTOMOBILE engine manufacturing , *PISTON rings - Abstract
Control charts are widely known quality tools used to detect and control industrial process deviations in statistical process control. In the current paper, we propose a new single memory-type control chart, called the maximum double generally weighted moving average chart (referred to as Max-DGWMA), that simultaneously detects shifts in the process mean and/or process dispersion. The run length performance of the proposed Max-DGWMA chart is compared with that of the Max-EWMA, Max-DEWMA, Max-GWMA and SS-DGWMA charts, using time-varying control limits, through Monte Carlo simulations. The comparisons reveal that the proposed chart is more efficient than the Max-EWMA, Max-DEWMA and Max-GWMA charts, while it is comparable with the SS-DGWMA chart. An automotive industry application is presented in order to implement the Max-DGWMA chart. The goal is to establish statistical control of the manufacturing process of automobile engine piston rings. The source of the out-of-control signals is interpreted and the efficiency of the proposed chart in detecting shifts faster is evident. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
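The "Max" idea behind the chart in the entry above, folding mean and dispersion into one plotted statistic, can be sketched with a simplified Max-EWMA chart (a single exponentially weighted average, not the paper's double generally weighted scheme). The subgroup data, smoothing constant and control limit below are illustrative choices, not values from the paper.

```python
import numpy as np
from scipy import stats

def max_ewma(samples, mu0, sigma0, lam=0.2, ucl=3.0):
    """Simplified Max-EWMA chart: plot M_i = max(|Z_i|, |Y_i|), where Z and Y are
    EWMAs of normalized mean and dispersion statistics (illustrative sketch only)."""
    n = samples.shape[1]
    z = y = 0.0
    mstats, signals = [], []
    limit = ucl * np.sqrt(lam / (2 - lam))   # asymptotic EWMA control limit
    for x in samples:
        u = (x.mean() - mu0) / (sigma0 / np.sqrt(n))     # standardized subgroup mean
        chi = (n - 1) * x.var(ddof=1) / sigma0**2
        v = stats.norm.ppf(stats.chi2.cdf(chi, n - 1))   # normal-transformed dispersion
        z = lam * u + (1 - lam) * z                      # EWMA of the mean statistic
        y = lam * v + (1 - lam) * y                      # EWMA of the dispersion statistic
        m = max(abs(z), abs(y))
        mstats.append(m)
        signals.append(m > limit)
    return np.array(mstats), np.array(signals)

# Illustrative run: 20 in-control subgroups, then 20 with a 1.5-sigma mean shift
rng = np.random.default_rng(2)
data = np.vstack([rng.normal(0, 1, size=(20, 5)), rng.normal(1.5, 1, size=(20, 5))])
mstat, sig = max_ewma(data, mu0=0.0, sigma0=1.0)
```

A single plotted statistic makes the chart easy to read; diagnosing whether the mean or the dispersion caused a signal requires inspecting Z and Y separately.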
45. Testing for genetic mutation of seasonal influenza virus.
- Author
-
Liu, Vera and Walker, Stephen
- Subjects
- *
SEASONAL influenza , *GENETIC mutation , *GENETIC testing , *INFLUENZA A virus , *INFLUENZA viruses , *VACCINE effectiveness , *INFLUENZA - Abstract
Influenza virus strains undergo genetic mutations every year and these changes in genetic makeup pose difficulties for effective vaccine selection. To better understand the problem, it is important to statistically quantify the amount of genetic change between circulating strains from different years. In this paper, we propose the nonparametric crossmatch test, applied to phylogenetic trees, to assess the level of discrepancy between circulating flu virus strains from two years, with the viruses represented by a phylogenetic tree. The crossmatch test has advantages over parametric tests in that it preserves more information in the data. The outcome of the test indicates whether the circulating influenza virus has mutated sufficiently in the past year to be considered a new population of virus, suggesting the need to consider a new vaccine. We validate the test on simulated phylogenetic tree samples with varying branch lengths, as well as with publicly available virus sequence data from the 'Global Initiative on Sharing All Influenza Data' (GISAID: ). [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
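The crossmatch statistic is simple to state: pool the two samples, pair all observations by an optimal (minimum total distance) non-bipartite matching, and count the pairs that join the two groups; few cross-group pairs suggest the groups are distinct populations. The sketch below illustrates this on 1-D points with a brute-force matching; it is a toy assumption-laden version (real applications use tree distances and a polynomial-time matching algorithm, not enumeration).

```python
import itertools

def min_weight_pairing(dist):
    """Brute-force minimum-weight perfect matching for a small, even-sized
    distance matrix (feasible only for toy n; the real test uses an
    optimal non-bipartite matching algorithm)."""
    def pairings(items):
        if not items:
            yield []
            return
        first, rest = items[0], items[1:]
        for i in range(len(rest)):
            for tail in pairings(rest[:i] + rest[i + 1:]):
                yield [(first, rest[i])] + tail
    return min(pairings(list(range(len(dist)))),
               key=lambda p: sum(dist[i][j] for i, j in p))

def crossmatch_statistic(points, labels):
    """Number of pairs in the optimal matching that join the two groups;
    small values indicate well-separated groups."""
    dist = [[abs(a - b) for b in points] for a in points]
    pairing = min_weight_pairing(dist)
    return sum(1 for i, j in pairing if labels[i] != labels[j])

# Well-separated groups: the optimal matching stays within groups.
separated = crossmatch_statistic(
    [0.0, 0.1, 0.2, 0.3, 10.0, 10.1, 10.2, 10.3],
    [0, 0, 0, 0, 1, 1, 1, 1])
# Interleaved groups: every optimal pair crosses the group boundary.
mixed = crossmatch_statistic(
    [0.0, 0.05, 1.0, 1.05, 2.0, 2.05, 3.0, 3.05],
    [0, 1, 0, 1, 0, 1, 0, 1])
```

A permutation null distribution for the count then turns the statistic into a p-value, which is how the test decides whether the two years' strains form one population or two.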
46. Regression models using the LINEX loss to predict lower bounds for the number of points for approximating planar contour shapes.
- Author
-
Jayasinghe, J. M. Thilini, Ellingson, Leif, and Prematilake, Chalani
- Subjects
- *
REGRESSION analysis , *INDEPENDENT variables , *LEAST squares , *APPROXIMATION error , *STATISTICS , *SAMPLING errors - Abstract
Researchers in statistical shape analysis often analyze outlines of objects. Even though these contours are infinite-dimensional in theory, they must be discretized in practice. When discretizing, it is important to reduce the number of sampling points considerably to reduce computational costs, but not to use so few points that the approximation error becomes too large. Unfortunately, determining the minimum number of points needed to sufficiently approximate the contours is computationally expensive. In this paper, we fit regression models to predict these lower bounds using characteristics of the contours that are computationally cheap as predictor variables. However, least squares regression is inadequate for this task because it treats overestimation and underestimation equally, whereas underestimation of lower bounds is far more serious. Instead, to fit the models, we use the LINEX loss function, which allows us to penalize underestimation at an exponential rate while penalizing overestimation only linearly. We present a novel approach to selecting the shape parameter of the loss function and tools for analyzing how well the model fits the data. Through validation methods, we show that the LINEX models work well for reducing underestimation of the lower bounds. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
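The asymmetry described in the abstract comes directly from the LINEX form L(r) = exp(ar) - ar - 1: with a > 0 and residual r = y - y_hat, underestimation (r > 0) is penalized exponentially while overestimation grows only about linearly. A minimal sketch of fitting a straight line under this loss by gradient descent (the learning rate, step count, and toy data are illustrative assumptions; the paper's shape-parameter selection is not reproduced here):

```python
import math

def linex_loss(r, a=1.0):
    """LINEX loss exp(a*r) - a*r - 1; asymmetric around r = 0 for a != 0."""
    return math.exp(a * r) - a * r - 1.0

def fit_linex(xs, ys, a=1.0, lr=0.05, steps=5000):
    """Fit y ~ b0 + b1*x by gradient descent on the mean LINEX loss.
    dL/dr = a*exp(a*r) - a, with r = y - b0 - b1*x."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            r = y - (b0 + b1 * x)
            d = a * math.exp(a * r) - a
            g0 += -d          # chain rule: dr/db0 = -1
            g1 += -d * x      # chain rule: dr/db1 = -x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

# Toy noiseless data y = 0.2 + 0.5 x; the LINEX fit recovers the line.
xs = [i / 10 for i in range(10)]
ys = [0.2 + 0.5 * x for x in xs]
b0, b1 = fit_linex(xs, ys)
```

With noisy data the same fit shifts the line upward relative to least squares, which is exactly the conservative behavior wanted for predicting lower bounds.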
47. A new class of efficient and debiased two-step shrinkage estimators: method and application.
- Author
-
Qasim, Muhammad, Månsson, Kristofer, Sjölander, Pär, and Kibria, B. M. Golam
- Subjects
- *
MULTICOLLINEARITY , *MONTE Carlo method , *LEAST squares , *REGRESSION analysis - Abstract
This paper introduces a new class of efficient and debiased two-step shrinkage estimators for a linear regression model in the presence of multicollinearity. We derive the proposed estimators' mean square error and define the necessary and sufficient conditions for superiority over the existing estimators. In addition, we develop an algorithm for selecting the shrinkage parameters for the proposed estimators. The comparison of the new estimators versus the traditional ordinary least squares, ridge regression, Liu, and two-parameter estimators is done by a matrix mean square error criterion. The Monte Carlo simulation results show the superiority of the proposed estimators under certain conditions. In the presence of high but imperfect multicollinearity, the two-step shrinkage estimators' performance is relatively better. Finally, two real-world chemical data sets are analyzed to demonstrate the advantages and the empirical relevance of our newly proposed estimators. It is shown that the standard errors and the estimated mean square error decrease substantially for the proposed estimator. Hence, the precision of the estimated parameters is increased, which is, of course, one of the main objectives of practitioners. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
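The baseline the paper improves on can be seen in a few lines: the ridge estimator (X'X + kI)^(-1) X'y trades a little bias for a large variance reduction when predictors are nearly collinear, and k = 0 recovers OLS. A Monte Carlo sketch of that classical effect with a two-predictor design (the collinearity level, k, and sample sizes are illustrative assumptions; this is ridge, not the paper's two-step estimator):

```python
import random

def ridge(X, y, k=0.0):
    """Ridge estimator (X'X + kI)^-1 X'y for a two-column design,
    solved via the closed-form 2x2 inverse; k = 0 gives OLS."""
    s11 = sum(x[0] * x[0] for x in X)
    s12 = sum(x[0] * x[1] for x in X)
    s22 = sum(x[1] * x[1] for x in X)
    t1 = sum(x[0] * yi for x, yi in zip(X, y))
    t2 = sum(x[1] * yi for x, yi in zip(X, y))
    a, b, d = s11 + k, s12, s22 + k
    det = a * d - b * b
    return ((d * t1 - b * t2) / det, (a * t2 - b * t1) / det)

def mc_mse(k, reps=200, n=30, beta=(1.0, 1.0)):
    """Monte Carlo MSE of the coefficient estimates under near-collinearity."""
    total = 0.0
    for _ in range(reps):
        X, y = [], []
        for _ in range(n):
            z = random.gauss(0, 1)
            x1, x2 = z, z + 0.05 * random.gauss(0, 1)  # nearly collinear pair
            X.append((x1, x2))
            y.append(beta[0] * x1 + beta[1] * x2 + random.gauss(0, 0.5))
        b1, b2 = ridge(X, y, k)
        total += (b1 - beta[0]) ** 2 + (b2 - beta[1]) ** 2
    return total / reps

random.seed(0)
mse_ols = mc_mse(k=0.0)
mse_ridge = mc_mse(k=0.5)
```

The paper's two-step estimators add a debiasing step on top of this shrinkage idea, aiming to keep the variance reduction while removing much of the bias.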
48. Online monitoring of high-dimensional binary data streams with application to extreme weather surveillance.
- Author
-
Fang, Zhiwen, Li, Wendong, Liu, Xin, Pu, Xiaolong, and Xiang, Dongdong
- Subjects
- *
EXTREME weather , *STATISTICAL process control , *QUALITY control charts , *MOVING average process , *COMPUTATIONAL complexity - Abstract
With the rapid development of modern sensor technology, high-dimensional data streams appear frequently nowadays, creating an urgent need for effective statistical process control (SPC) tools. In such a context, the online monitoring problem of high-dimensional and correlated binary data streams is becoming very important. Conventional SPC methods for monitoring multivariate binary processes may fail when facing high-dimensional applications due to high computational complexity and a lack of efficiency. In this paper, motivated by an application in extreme weather surveillance, we propose a novel pairwise approach that considers the most informative pairwise correlation between any two data streams. The information is then integrated into an exponentially weighted moving average (EWMA) charting scheme to monitor abnormal mean changes in high-dimensional binary data streams. An extensive simulation study, together with a real-data analysis, demonstrates the efficiency and applicability of the proposed control chart. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
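The EWMA building block of the proposed scheme is easy to see on a single Bernoulli stream: smooth the 0/1 observations and signal when the smoothed value drifts too far from the in-control rate. The sketch below shows only that single-stream ingredient with illustrative parameter choices; the paper's contribution, pooling pairwise-correlation information across many streams, is not reproduced here.

```python
import math
import random

def ewma_binary_monitor(stream, p0, lam=0.1, L=3.0):
    """EWMA chart on one Bernoulli stream: smooth the 0/1 data and signal
    when the EWMA leaves p0 +/- L times its asymptotic standard deviation."""
    sd = math.sqrt(lam / (2 - lam) * p0 * (1 - p0))
    w = p0  # start the EWMA at the in-control rate
    signals = []
    for t, x in enumerate(stream):
        w = lam * x + (1 - lam) * w
        if abs(w - p0) > L * sd:
            signals.append(t)
    return signals

# 200 in-control observations at rate 0.1, then a shift to rate 0.4
random.seed(7)
p0 = 0.1
stream = [1 if random.random() < p0 else 0 for _ in range(200)]
stream += [1 if random.random() < 0.4 else 0 for _ in range(100)]
signals = ewma_binary_monitor(stream, p0)
```

Because a binary observation carries so little information on its own, the memory of the EWMA (and, in the paper, the cross-stream correlation) is what makes detection of modest rate shifts feasible.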
49. Robust and efficient estimation of GARCH models based on Hellinger distance.
- Author
-
Zhao, Qiang, Chen, Liang, and Wu, Jingjing
- Subjects
- *
GARCH model , *DATA scrubbing , *INFERENTIAL statistics - Abstract
It is well known that financial data frequently contain outlying observations. Almost all methods and techniques used to estimate GARCH models are likelihood-based and thus generally non-robust against outliers. The minimum distance method, an important tool for statistical inference and a competitive alternative for achieving robustness, has surprisingly not been well explored for GARCH models. In this paper, we propose a minimum Hellinger distance estimator (MHDE) and a minimum profile Hellinger distance estimator (MPHDE), depending on whether the innovation distribution is specified or not, for estimating the parameters in GARCH models. The construction and investigation of the two estimators are quite involved due to the non-i.i.d. nature of the data. We prove that the MHDE is a consistent estimator and derive its bias in explicit form. For both of the proposed estimators, we demonstrate their finite-sample performance through simulation studies and compare them with well-established methods including MLE, Gaussian Quasi-MLE, Non-Gaussian Quasi-MLE and the Least Absolute Deviation estimator. Our numerical results show that MHDE and MPHDE perform much better than MLE-based methods when the data are contaminated, while remaining very competitive when the data are clean, which testifies to the robustness and efficiency of the two proposed MHD-type estimators. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
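The robustness mechanism behind minimum Hellinger distance estimation can be shown in a much simpler setting than GARCH: estimate a normal location parameter by minimizing the Hellinger distance between a kernel density estimate of the data and the model density. Gross outliers contribute almost nothing to the square-root density on the region where the model has mass, so the estimate barely moves, unlike the sample mean. The bandwidth, grids, and N(mu, 1) model below are illustrative assumptions.

```python
import math
import random

def normal_pdf(x, mu, sigma=1.0):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def kde(data, x, h):
    """Gaussian kernel density estimate of the data at point x."""
    return sum(normal_pdf(x, xi, h) for xi in data) / len(data)

def hellinger_sq(data, mu, h, grid):
    """Squared Hellinger distance H^2 = (1/2) * integral (sqrt(f) - sqrt(g))^2
    between the KDE and an N(mu, 1) model, approximated on a fixed grid."""
    step = grid[1] - grid[0]
    return 0.5 * step * sum(
        (math.sqrt(kde(data, x, h)) - math.sqrt(normal_pdf(x, mu))) ** 2
        for x in grid)

def mhde_location(data, h=0.5):
    """Minimum Hellinger distance location estimate via a grid search."""
    grid = [i * 0.1 for i in range(-50, 51)]          # integration grid on [-5, 5]
    candidates = [i * 0.05 for i in range(-40, 41)]   # mu candidates in [-2, 2]
    return min(candidates, key=lambda mu: hellinger_sq(data, mu, h, grid))

random.seed(3)
clean = [random.gauss(0.5, 1) for _ in range(50)]
contaminated = clean + [20.0] * 5  # five gross outliers
mu_clean = mhde_location(clean)
mu_cont = mhde_location(contaminated)
mean_cont = sum(contaminated) / len(contaminated)
```

For GARCH the same principle applies, but the non-i.i.d. dependence structure makes both the density estimation and the asymptotic theory substantially harder, which is what the paper works out.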
50. Inference on moderation effect with third-variable effect analysis – application to explore the trend of racial disparity in oncotype dx test for breast cancer treatment.
- Author
-
Yu, Qingzhao, Zhang, Lu, Wu, Xiaocheng, and Li, Bin
- Subjects
- *
RACIAL inequality , *BREAST cancer , *MODERATION , *CANCER treatment , *RACIAL differences - Abstract
A third-variable effect refers to the effect from a third variable that explains an observed relationship between an exposure and an outcome. Depending on whether there is a causal relationship, a third variable typically takes the form of a mediator or a confounder. A moderation effect is a special case of the third-variable effect, where the moderator and other variables have an interactive effect on the outcome. In this paper, we extend the R package 'mma' for moderation analysis so that third-variable effects can be reported at different levels of the moderator. The proposed moderation analysis uses tree-structured models to automatically detect moderation effects and can handle both categorical and numerical moderators. We propose algorithms and graphical methods for making inference on moderation effects and illustrate the method under different scenarios of moderation effects. Finally, we apply the proposed method to explore the trend of racial disparities in the use of Oncotype DX recurrence tests among breast cancer patients. We found that the unexplained racial differences in using the tests decreased from 2010 to 2015. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
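The core idea of "reporting effects at different levels of the moderator" can be sketched with the simplest possible version: stratify by a categorical moderator and estimate the exposure effect within each level; unequal slopes across levels indicate a moderation (interaction) effect. This toy Python sketch with simulated data is only the stratified-slopes idea; the paper's 'mma' extension uses tree-structured models and formal inference instead.

```python
import random

def slope(xs, ys):
    """Simple least-squares slope of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def stratified_effects(x, y, m):
    """Exposure effect on the outcome at each level of a categorical
    moderator; different slopes across levels suggest moderation."""
    return {lev: slope([xi for xi, mi in zip(x, m) if mi == lev],
                       [yi for yi, mi in zip(y, m) if mi == lev])
            for lev in sorted(set(m))}

# Simulated moderation: the effect of x on y is 1.0 when m == 0, 2.0 when m == 1.
random.seed(11)
x = [random.gauss(0, 1) for _ in range(400)]
m = [random.choice([0, 1]) for _ in range(400)]
y = [(1.0 + mi) * xi + random.gauss(0, 0.3) for xi, mi in zip(x, m)]
effects = stratified_effects(x, y, m)
```

In the paper's application the "moderator" role is played by diagnosis year, so the method can show how the unexplained racial difference in test use changes from 2010 to 2015.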