260 results on '"mean estimation"'
Search Results
2. Constructing a new estimator for estimating population mean utilizing auxiliary information in probability proportional to size sampling
- Author
-
Alghamdi, Safar M., Ahmad, Sohaib, Almarzouki, Sanaa Mohammed, Aloraini, Badr, Badr, Majdah Mohammed, and Abdelkawy, M.A.
- Published
- 2025
- Full Text
- View/download PDF
3. A novel approach for estimation Population Mean with Dual Use of in Stratified Random Sampling
- Author
-
Alzahrani, Mohammed R. and Almohaimeed, Mohammed
- Published
- 2025
- Full Text
- View/download PDF
4. Optimizing population mean estimation under ranked set sampling with applications to Engineering
- Author
-
Eisa, Muhammad, Iqbal, Muhammad, Ali, Hameed, Mahmood, Zafar, and Znaidia, Sami
- Published
- 2024
- Full Text
- View/download PDF
5. Enhancing mean estimators in median ranked set sampling with dual auxiliary information
- Author
-
Alharbi, Randa, Mustafa, Manahil SidAhmed, Al Mutairi, Aned, Hussein, Mohamed, Yusuf, M., Elshenawy, Assem, and Nassr, Said G.
- Published
- 2023
- Full Text
- View/download PDF
6. New Class of Estimators for Finite Population Mean Under Stratified Double Phase Sampling with Simulation and Real-Life Application.
- Author
-
Alghamdi, Abdulaziz S. and Alrweili, Hleil
- Subjects
- *EXTREME value theory, *ESTIMATION theory, *ESTIMATION bias, *INFORMATION resources
- Abstract
Sample survey data can sometimes contain outlier observations. When the mean estimator becomes skewed due to the presence of extreme values in the sample, results can be biased. The tendency to remove outliers from sample data is common. However, performing such removal can reduce the accuracy of conventional estimating techniques, particularly with regard to the mean square error (MSE). In order to increase population mean estimation accuracy while taking extreme values into consideration, this study presents an enhanced class of estimators. The method uses extreme values from an auxiliary variable as a source of information rather than eliminating these outliers. Using a first-order approximation, the properties of the suggested class of estimators are investigated within the context of a stratified two-phase sampling framework. A simulation study is conducted to examine the practical performance of these estimators in order to validate the theoretical conclusions. An analysis of three different datasets further shows that the suggested estimators consistently provide higher percent relative efficiency (PRE) than existing estimators when dealing with extreme values. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
7. A Comparative Study of New Ratio-Type Family of Estimators Under Stratified Two-Phase Sampling.
- Author
-
Alghamdi, Abdulaziz S. and Alrweili, Hleil
- Subjects
- *EXTREME value theory, *SAMPLING (Process), *COMPARATIVE studies, *PERCENTILES
- Abstract
Two-phase sampling is a useful technique for sample surveys, particularly when prior auxiliary data is not accessible. The ranks of the auxiliary variable often coincide with those of the research variable when two variables are correlated. By considering this relationship, we can significantly increase estimator accuracy. In this paper, we use the ranks of the auxiliary variable along with extreme values to estimate the population mean of the study variable. Up to a first-order approximation, we analyze the characteristics of the suggested class of estimators with an emphasis on biases and mean squared errors in stratified two-phase sampling. The theoretical results are verified using different datasets and a simulation study, which demonstrates that the proposed estimators outperform the existing ones in terms of percent relative efficiency. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
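Several of the abstracts above compare estimators by mean squared error (MSE) and percent relative efficiency (PRE). As a minimal sketch for orientation, the Python below computes the classical textbook ratio estimator of a population mean and the PRE of one estimator against a reference. It is not the rank-based or extreme-value class of estimators proposed in these papers, and the function names `ratio_estimate` and `percent_relative_efficiency` are hypothetical.

```python
def ratio_estimate(y_sample, x_sample, x_pop_mean):
    """Classical ratio estimator: y_bar * (X_bar / x_bar).

    Assumes the population mean of the auxiliary variable (x_pop_mean)
    is known and the auxiliary sample mean is non-zero.
    """
    if len(y_sample) != len(x_sample) or not y_sample:
        raise ValueError("samples must be non-empty and of equal length")
    y_bar = sum(y_sample) / len(y_sample)
    x_bar = sum(x_sample) / len(x_sample)
    if x_bar == 0:
        raise ValueError("auxiliary sample mean must be non-zero")
    return y_bar * (x_pop_mean / x_bar)


def percent_relative_efficiency(mse_reference, mse_proposed):
    """PRE = 100 * MSE(reference) / MSE(proposed); above 100 favors the proposed estimator."""
    if mse_proposed <= 0:
        raise ValueError("MSE of the proposed estimator must be positive")
    return 100.0 * mse_reference / mse_proposed
```

A PRE above 100 means the proposed estimator has a smaller MSE than the reference estimator.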
8. Optimal classes of memory-type estimators of population mean for temporal surveys
- Author
-
Anoop Kumar, Renu Kumari, and Abdullah Mohammed Alomair
- Subjects
mean square error, exponentially weighted moving average, simple random sampling, mean estimation, efficiency, Mathematics, QA1-939
- Abstract
In this article, we explore how to efficiently estimate the population mean utilizing past and current sample information through exponentially weighted moving average (EWMA) statistics in temporal surveys. We propose some optimal classes of memory-type estimators of population mean for temporal surveys within the framework of simple random sampling (SRS). We derive the expressions for the bias and mean square error (MSE) of the suggested estimators up to first-order approximation. We compare the traditional and newly introduced memory-type estimators and establish the efficiency conditions. Moreover, we conduct a thorough simulation study using real and artificial populations to refine our theoretical outcomes. The simulation results show that using past and current sample data increases the efficiency of the proposed estimators.
- Published
- 2025
- Full Text
- View/download PDF
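The EWMA statistic that the abstract above builds on can be sketched in a few lines of Python. This shows only the basic recursion Z_t = λ·ȳ_t + (1 − λ)·Z_{t−1} applied to a sequence of per-wave sample means; it is not the optimal class of memory-type estimators the paper proposes. The function name `ewma_means`, the smoothing constant `lam`, and the starting value `z0` are assumptions made for illustration.

```python
def ewma_means(sample_means, lam, z0=None):
    """Return the EWMA statistic after each survey wave.

    sample_means: per-wave sample means, oldest first.
    lam: smoothing constant in (0, 1]; lam = 1 ignores past waves.
    z0: starting value; defaults to the first sample mean (an assumption).
    """
    if not 0 < lam <= 1:
        raise ValueError("lam must lie in (0, 1]")
    if not sample_means:
        return []
    z = sample_means[0] if z0 is None else z0
    history = []
    for y_bar in sample_means:
        # Blend the current wave's mean with the memory of earlier waves.
        z = lam * y_bar + (1 - lam) * z
        history.append(z)
    return history
```

Smaller values of `lam` put more weight on earlier waves, which is the sense in which such estimators have "memory".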
9. Efficient population mean estimation via stratified sampling with dual auxiliary information: A real estate perspective
- Author
-
G.R.V. Triveni and Faizan Danish
- Subjects
Two-fold Auxiliary Information, Study Variable, Mean Estimation, Real Data, Stratified Sampling, Engineering (General). Civil engineering (General), TA1-2040
- Abstract
Auxiliary information is an essential component in the field of survey sampling since it enables precise estimation of population parameters like mean, variance, distribution function, and so on, which in turn guarantees the best possible outcomes. In order to estimate the population mean of a study variable, this study makes use of auxiliary information in a two-fold approach. Through a stratified random sampling scheme, we introduce a novel class of estimators that utilize auxiliary information and their corresponding ranks. By conducting a thorough evaluation based on metrics such as mean square error and percentage relative efficiency, these proposed estimators have been shown to be effective in the estimation process. Empirical validation is conducted using a real dataset sourced from the domain of real estate. Exploring the relationship between Assessed Value (X) and Sale Amount (Y) during a five-year period extending from 2017 to 2021 is the primary emphasis of the empirical validation process, which is carried out with the assistance of a real dataset of real estate data. Furthermore, in order to demonstrate that our suggested estimator is superior to conventional unbiased estimators, as well as traditional regression estimators and other estimators that have been considered in the literature, a full simulation analysis is carried out. Our proposed estimator appears to be the most effective choice after being subjected to a comparison study against a variety of preexisting approaches. The findings of this study not only make a significant contribution to the development of the methodology of survey sampling but also offer vital insights for predictive modeling within the real estate sector.
- Published
- 2024
- Full Text
- View/download PDF
10. A Bias-Accuracy-Privacy Trilemma for Statistical Estimation.
- Author
-
Kamath, Gautam, Mouzakis, Argyris, Regehr, Matthew, Singhal, Vikrant, Steinke, Thomas, and Ullman, Jonathan
- Subjects
- *DATA privacy, *STATISTICAL bias, *ESTIMATION bias, *PRIVACY, *NOISE
- Abstract
Differential privacy (DP) is a rigorous notion of data privacy, used for private statistics. The canonical algorithm for differentially private mean estimation is to first clip the samples to a bounded range and then add noise to their empirical mean. Clipping controls the sensitivity and, hence, the variance of the noise that we add for privacy. But clipping also introduces statistical bias. This tradeoff is inherent: we prove that no algorithm can simultaneously have low bias, low error, and low privacy loss for arbitrary distributions. Additionally, we show that under strong notions of DP (i.e., pure or concentrated DP), unbiased mean estimation is impossible, even if we assume that the data is sampled from a Gaussian. On the positive side, we show that unbiased mean estimation is possible under a more permissive notion of differential privacy (approximate DP) if we assume that the distribution is symmetric. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
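The clip-then-add-noise recipe described in the abstract above can be sketched as follows. This is a minimal illustration using Laplace noise, which gives pure ε-DP when the number of samples is treated as public and neighboring datasets differ in one record; the paper itself does not prescribe this particular code. The function name `dp_clipped_mean` and the parameters `lo`, `hi`, and `epsilon` are assumptions.

```python
import random


def dp_clipped_mean(samples, lo, hi, epsilon, rng=None):
    """Clip each sample to [lo, hi], average, then add Laplace noise.

    Changing one record moves the clipped mean by at most (hi - lo) / n,
    so Laplace noise with scale (hi - lo) / (n * epsilon) suffices.
    """
    if not samples or hi <= lo or epsilon <= 0:
        raise ValueError("need samples, hi > lo, and epsilon > 0")
    rng = rng or random.Random()
    n = len(samples)
    clipped = [min(max(x, lo), hi) for x in samples]
    clipped_mean = sum(clipped) / n
    scale = (hi - lo) / (n * epsilon)
    # The difference of two i.i.d. exponentials is Laplace(0, scale).
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return clipped_mean + noise
```

The bias the abstract refers to is visible here: any sample outside `[lo, hi]` is pulled to the boundary before averaging, so the output is centered on the clipped mean rather than the true mean.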
11. Efficient population mean estimation via stratified sampling with dual auxiliary information: A real estate perspective.
- Author
-
Triveni, G.R.V. and Danish, Faizan
- Subjects
DISTRIBUTION (Probability theory), REAL estate business, PARAMETERS (Statistics), REAL property, EMPIRICAL research, STATISTICAL sampling
- Abstract
Auxiliary information is an essential component in the field of survey sampling since it enables precise estimation of population parameters like mean, variance, distribution function, and so on, which in turn guarantees the best possible outcomes. In order to estimate the population mean of a study variable, this study makes use of auxiliary information in a two-fold approach. Through a stratified random sampling scheme, we introduce a novel class of estimators that utilize auxiliary information and their corresponding ranks. By conducting a thorough evaluation based on metrics such as mean square error and percentage relative efficiency, these proposed estimators have been shown to be effective in the estimation process. Empirical validation is conducted using a real dataset sourced from the domain of real estate. Exploring the relationship between Assessed Value (X) and Sale Amount (Y) during a five-year period extending from 2017 to 2021 is the primary emphasis of the empirical validation process, which is carried out with the assistance of a real dataset of real estate data. Furthermore, in order to demonstrate that our suggested estimator is superior to conventional unbiased estimators, as well as traditional regression estimators and other estimators that have been considered in the literature, a full simulation analysis is carried out. Our proposed estimator appears to be the most effective choice after being subjected to a comparison study against a variety of preexisting approaches. The findings of this study not only make a significant contribution to the development of the methodology of survey sampling but also offer vital insights for predictive modeling within the real estate sector. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
12. Estimation of finite population mean using dual auxiliary information under non-response with simple random sampling
- Author
-
Fatimah A. Almulhim, Hassan M. Aljohani, Ramy Aldallal, Manahil SidAhmed Mustafa, Meshayil M. Alsolmi, Assem Elshenawy, and Afaf Alrashidi
- Subjects
Mean estimation, CDF, Population, Auxiliary, Dual, Information, Engineering (General). Civil engineering (General), TA1-2040
- Abstract
This paper proposes a new family of estimators for the population mean under non-response using simple random sampling. It specifically applies this method to estimate the mean in Turkey’s education-sector data sets, accounting for non-response. The study integrates additional information in the form of the mean and CDF of the auxiliary variable, which is highly positively correlated with the study variable, to develop a general class of estimators tailored for non-response under a simple random sampling scheme. Through numerical research, the characteristics of these estimators, namely their biases and mean square errors, have been carefully investigated and evaluated. The suggested estimators outperform the basic estimator of the population mean. The empirical investigation validates the results, demonstrating the improved relative efficiency of the proposed estimator over the existing estimators, as shown by theoretical and numerical comparisons.
- Published
- 2024
- Full Text
- View/download PDF
13. Federated computation: a survey of concepts and challenges.
- Author
-
Bharadwaj, Akash and Cormode, Graham
- Subjects
DATA privacy, DATA management, DATA analytics, PRIVACY, INFORMATION sharing
- Abstract
Federated Computation is an emerging area that seeks to provide stronger privacy for user data, by performing large scale, distributed computations where the data remains in the hands of users. Only the necessary summary information is shared, and additional security and privacy tools can be employed to provide strong guarantees of secrecy. The most prominent application of federated computation is in training machine learning models (federated learning), but many additional applications are emerging, more broadly relevant to data management and querying data. This survey gives an overview of federated computation models and algorithms. It includes an introduction to security and privacy techniques and guarantees, and shows how they can be applied to solve a variety of distributed computations providing statistics and insights to distributed data. It also discusses the issues that arise when implementing systems to support federated computation, and open problems for future research. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
14. Estimation of finite population mean using dual auxiliary information under non-response with simple random sampling.
- Author
-
Almulhim, Fatimah A., Aljohani, Hassan M., Aldallal, Ramy, Mustafa, Manahil SidAhmed, Alsolmi, Meshayil M., Elshenawy, Assem, and Alrashidi, Afaf
- Subjects
STATISTICAL sampling, NONRESPONSE (Statistics)
- Abstract
This paper proposes a new family of estimators for the population mean under non-response using simple random sampling. It specifically applies this method to estimate the mean in Turkey's education-sector data sets, accounting for non-response. The study integrates additional information in the form of the mean and CDF of the auxiliary variable, which is highly positively correlated with the study variable, to develop a general class of estimators tailored for non-response under a simple random sampling scheme. Through numerical research, the characteristics of these estimators, namely their biases and mean square errors, have been carefully investigated and evaluated. The suggested estimators outperform the basic estimator of the population mean. The empirical investigation validates the results, demonstrating the improved relative efficiency of the proposed estimator over the existing estimators, as shown by theoretical and numerical comparisons. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
15. Multivariate mean estimation with direction-dependent accuracy.
- Author
-
Lugosi, Gábor and Mendelson, Shahar
- Subjects
- *MULTIVARIATE analysis, *VECTOR analysis, *MATHEMATICS, *MATHEMATICAL equivalence, *PROBABILITY theory
- Abstract
We consider the problem of estimating the mean of a random vector based on N independent, identically distributed observations. We prove the existence of an estimator that has a near-optimal error in all directions in which the variance of the one-dimensional marginal of the random vector is not too small: with probability 1-δ, the procedure returns μ_N which satisfies, for every direction u ∈ S^{d-1}, ... where σ²(u) = Var(⟨X, u⟩) and C is a constant. To achieve this, we require only slightly more than the existence of the covariance matrix, in the form of a certain moment-equivalence assumption. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. vqSGD: Vector Quantized Stochastic Gradient Descent
- Author
-
Gandikota, Venkata, Kane, Daniel, Maity, Raj Kumar, and Mazumdar, Arya
- Subjects
Engineering, Information and Computing Sciences, Communications Engineering, Computer Vision and Multimedia Computation, Machine Learning, Vector quantization, communication efficiency, mean estimation, stochastic gradient descent, Artificial Intelligence and Image Processing, Electrical and Electronic Engineering, Communications Technologies, Networking & Telecommunications, Communications engineering, Theory of computation
- Published
- 2022
17. RCP: Mean Value Protection Technology Under Local Differential Privacy
- Author
-
LIU Likang, ZHOU Chunlai
- Subjects
local differential privacy, mean estimation, random response, random censoring, utility optimization, Computer software, QA76.75-76.765, Technology (General), T1-995
- Abstract
This paper mainly focuses on the mean estimation problem in differential privacy query. After introducing the current mainstream local differential privacy design scheme of numerical data mean estimation, it first introduces the random censoring mechanism in random response technology to reveal the basic principle of mean calculation under local differential privacy, proposes a utility optimization theorem about the variance of mean estimation, and gives a boundary optimization formula, which improves the interpretability and operability of utility optimization theory in this field. Based on this theory, this paper proposes a practical, concise and efficient mean estimation algorithm protocol RCP for the first time, which can be used to collect and analyze the data of intelligent device users connected to the Internet, while meeting the requirements of local differential privacy. RCP is simple in structure, supports data analysis tasks on any number of numerical attributes, and has efficient communication and calculation, effectively alleviating the practical problems of complex algorithm design, difficult optimization, and low efficiency. Finally, empirical research demonstrates that the proposed method outperforms other existing schemes in terms of utility, efficiency and asymptotic error bounds.
- Published
- 2023
- Full Text
- View/download PDF
18. Ratio-Type Estimator for Estimating the Neutrosophic Population Mean in Simple Random Sampling under Intuitionistic Fuzzy Cost Function.
- Author
-
Ullah, Atta, Shabbir, Javid, Alomair, Abdullah Muhammad, and Alomair, Muhammad Ahmed
- Subjects
- *STATISTICAL sampling, *COST functions, *EXTREME value theory, *INFERENTIAL statistics, *PARAMETER estimation, *SAMPLE size (Statistics)
- Abstract
Survey sampling has a wide range of applications in biomedical, meteorological, stock exchange, marketing, and agricultural research based on data collected through sample surveys or experimentation. The collected set of information may have a fuzzy nature, be indeterminate, and be summarized by a fuzzy number rather than a crisp value. The neutrosophic statistics, a generalization of fuzzy statistics and classical statistics, deals with the data that have some degree of indeterminacy, imprecision, and fuzziness. In this article, we introduce a fuzzy decision-making approach for deciding a sample size under a fuzzy measurement cost modeled by an intuitionistic fuzzy cost function. Our research introduces neutrosophic ratio-type estimators for estimating the population mean of the neutrosophic study variable Y_N ∈ [Y_L, Y_U] utilizing all the indeterminate values of the neutrosophic auxiliary variable X_N ∈ [X_L, X_U] rather than only the extreme values X_L and X_U. Three simulation studies are carried out to explain the proposed methods of parameter estimation, sample size determination, and efficiency comparison. The results reveal that the proposed neutrosophic class of estimators produces more accurate and precise estimates of the neutrosophic population mean than the existing neutrosophic estimators in simple random sampling, which is the ultimate goal of inferential statistics. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
19. Neutrosophic Mean Estimation of Sensitive and Non-Sensitive Variables with Robust Hartley–Ross-Type Estimators.
- Author
-
Alomair, Abdullah Mohammed and Shahzad, Usman
- Subjects
- *MEASUREMENT errors, *OUTLIER detection, *GENERALIZATION
- Abstract
Under classical statistics, research typically relies on precise data to estimate the population mean when auxiliary information is available. Outliers can pose a significant challenge in this process. The ultimate goal is to determine the most accurate estimates of the population mean while minimizing variance. Neutrosophic statistics is a generalization of classical statistics that deals with imprecise, uncertain data. Our research introduces the neutrosophic Hartley–Ross-type ratio estimators for estimating the population mean of neutrosophic data, even in the presence of outliers. We also incorporate neutrosophic versions of several robust regression methods, including LAD, Huber-M, Hampel-M, and Tukey-M. Our approach assumes that the study variable is both non-sensitive and sensitive, meaning that it can cause discomfort to participants during personal interviews, and measurement errors can occur due to dishonest responses. To address potential measurement errors, we propose the use of neutrosophic scrambling response models. Our proposed neutrosophic robust estimators are more effective than existing classical estimators, as confirmed by a computer-based numerical study using real data and simulation. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
20. Communication-Efficient and Private Distributed Learning
- Author
-
Bebawy, Antonious Mamdouh Girgis
- Subjects
Mathematics, Computer science, Computer engineering, Bandits, Differential Privacy, Federated Learning, Mean Estimation, Privacy, Shuffle Model
- Abstract
We are currently facing a rapid growth of data contents originating from edge devices. These data resources offer significant potential for learning and extracting complex patterns in a range of distributed learning applications, such as healthcare, recommendation systems, and financial markets. However, the collection and processing of such extensive datasets through centralized learning procedures imposes potential challenges. As a result, there is a need for the development of distributed learning algorithms. This raises two principal challenges within the realm of distributed learning. The first challenge is to provide privacy guarantees for clients' data, as it may contain sensitive information that can be potentially mishandled. The second challenge involves addressing communication constraints, particularly in cases where clients are connected to a coordinator through wireless/band-limited networks. In this thesis, our objective is to develop fundamental information-theoretic bounds and devise distributed learning algorithms with privacy and communication requirements while maintaining the overall utility performance. We consider three different adversary models for differential privacy: (1) the central model, where a trusted server applies a private mechanism after collecting the raw data; (2) the local model, where each client randomizes her own data before making it public; (3) the shuffled model, where a trusted shuffler randomly permutes the randomized data before publishing them. The contributions of this thesis can be summarized as follows. First, we propose communication-efficient algorithms for estimating the mean of bounded $\ell_p$-norm vectors under privacy constraints in the local and shuffled models for $p\in[1,\infty]$. We also provide information-theoretic lower bounds showing that our algorithms have order-optimal privacy-communication-performance trade-offs. In addition, we present a generic algorithm for distributed mean estimation under user-level privacy constraints when each client has more than one data point. Second, we propose a distributed optimization algorithm to solve the empirical risk minimization (ERM) problem with communication and privacy guarantees and analyze its communication-privacy-convergence trade-offs. We extend our distributed algorithm to a client-self-sampling scheme that fits federated learning frameworks, where each client independently decides to contribute at each round based on tossing a biased coin. We also propose a user-level private algorithm for personalized federated learning. Third, we characterize the Rényi differential privacy (RDP) of the shuffled model by proposing closed-form upper and lower bounds for general local randomized mechanisms. RDP is a useful privacy notion that enables a much tighter composition for interactive mechanisms. Furthermore, we characterize the RDP of the subsampled shuffled model that combines privacy amplification via shuffling and amplification by subsampling. Fourth, we propose differentially private algorithms for the problem of stochastic linear bandits in the central, local, and shuffled models. Our algorithms achieve almost the same regret as the optimal non-private algorithms in the central and shuffled models, which means we get privacy for free. Fifth, we study successive refinement of privacy by providing hierarchical access to the raw data with different privacy levels. We provide (order-wise) tight characterizations of privacy-utility-randomness trade-offs in several cases of discrete distribution estimation.
- Published
- 2023
21. Correlation Analysis for Key-Value Data with Local Differential Privacy
- Author
-
SUN Lin, PING Guo-lou, YE Xiao-jun
- Subjects
local differential privacy, key-value data, correlation analysis, mean estimation, frequency estimation, Computer software, QA76.75-76.765, Technology (General), T1-995
- Abstract
Crowdsourced data from distributed sources are routinely collected and analyzed to produce effective data-mining models in crowdsensing systems. Data usually contains personal information, which leads to possible privacy leakage in data collection and analysis. Local differential privacy (LDP) has been deemed the de facto measure for the trade-off between privacy guarantee and data utility. Key-value data is a heterogeneous data type in which the key is categorical data and the value is numerical data. Achieving LDP for key-value data is challenging. This paper focuses on key-value data publishing and correlation analysis under the framework of LDP. First, the frequency correlation and mean correlation in key-value data are defined. Then the indexing one-hot perturbation mechanism is proposed to provide LDP guarantees. Finally, the correlation results can be estimated in the perturbed space. Theoretical analysis and experimental results on both real-world and synthetic datasets validate the effectiveness of the proposed mechanism.
- Published
- 2021
- Full Text
- View/download PDF
22. New double stage ranked set sampling for estimating the population mean.
- Author
-
Hanandeh, Ahmad A., Al-Nasser, Amjad D., and Al-Omari, Amer I.
- Subjects
- *MONTE Carlo method, *SAMPLING (Process)
- Abstract
In environmental and many other areas, the main focus of a survey is to measure elements using an efficient and cost-effective sampling technique. One way to achieve this is ranked set sampling (RSS). RSS is an alternative sampling technique that can be advantageous when measuring the variable of interest is either costly or time-consuming, but ranking small sets of units according to the character under investigation can be done by eye or by other methods not requiring actual quantification. The purpose of this article is to introduce a new modification of RSS to estimate the mean of the target population. This proposed technique is a double-stage approach that combines median RSS (MRSS) and MiniMax RSS (MMRSS). The performance of the empirical mean and variance estimators based on the proposed technique is compared with their counterparts in Double RSS (DRSS), Extreme RSS (ERSS), Double Extreme RSS (DERSS), MMRSS, RSS, and simple random sampling (SRS) via Monte Carlo simulation. Simulation results revealed that this new modification is almost always more efficient than its counterparts using MMRSS and SRS, while it is more efficient than RSS in many cases, especially when the distribution is asymmetric. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
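For readers unfamiliar with the baseline that the abstract above modifies, standard ranked set sampling can be sketched as follows. This is the plain RSS scheme, not the double-stage MRSS/MMRSS combination the article proposes, and it assumes perfect ranking by sorting on the true values (in practice ranking is done cheaply, for example by eye). The names `rss_sample`, `rss_mean`, and the `draw` callback are hypothetical.

```python
def rss_sample(draw, set_size, cycles=1):
    """Collect a ranked set sample.

    draw: zero-argument callable returning one unit from the population.
    set_size: number of units per set (and sets per cycle).
    In each cycle, set i contributes only its i-th smallest unit,
    so set_size * cycles units are actually measured.
    """
    measured = []
    for _ in range(cycles):
        for rank in range(set_size):
            # Rank a fresh set of units; sorting stands in for cheap ranking.
            ranked = sorted(draw() for _ in range(set_size))
            measured.append(ranked[rank])
    return measured


def rss_mean(draw, set_size, cycles=1):
    """Mean of the measured units, the usual RSS estimator of the population mean."""
    values = rss_sample(draw, set_size, cycles)
    return sum(values) / len(values)
```

For example, `rss_mean(lambda: random.gauss(0, 1), set_size=3, cycles=10)` (after `import random`) measures 30 units while ranking 90.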
23. Quantum Non-Identical Mean Estimation: Efficient Algorithms and Fundamental Limits
- Author
-
Hu, Jiachen, Li, Tongyang, Wang, Xinzhao, Xue, Yecheng, Zhang, Chenyi, and Zhong, Han
- Published
- 2024
- Full Text
- View/download PDF
24. Generalized Estimator for Population Mean Using Auxiliary Attribute in Stratified Two-Phase Sampling.
- Author
-
Rana, Quratulain, Qureshi, Muhammad Nouman, and Hanif, Muhammad
- Subjects
SAMPLING theorem, MATHEMATICAL models, STATISTICAL correlation, INFORMATION retrieval, PARAMETER estimation
- Abstract
In sampling theory, auxiliary information is widely used to increase the precision of estimators when the study variable is correlated with the auxiliary variable. In several practical situations, the auxiliary information is available in the form of auxiliary attribute(s). In this paper, we propose a generalized estimator for the estimation of the finite population mean using the auxiliary attribute under stratified two-phase sampling. The mathematical expressions of the approximate bias and mean square error (MSE) of the proposed estimator are obtained to the first order. Many special cases of the proposed estimator are also obtained using the known parameters of the auxiliary attribute. The MSE of the proposed estimator is compared algebraically with the MSE of the competing estimators. The performance of the proposed estimator is evaluated using real data sets. The results illustrate the good performance of the proposed estimator and its sub-cases. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
25. Long-term traffic flow estimation: a hybrid approach using location-based traffic characteristic.
- Author
-
AYAR, Tuğberk, ATLİNAR, Ferhat, GÜVENSAN, M. Amaç, and TÜRKMEN, H. İrem
- Subjects
- *TRAFFIC flow, *TRAFFIC estimation, *TRAFFIC speed, *CITY traffic, *URBAN planning, *ERROR rates
- Abstract
Traffic speed estimation plays a key role in various situations, ranging from an individual's trip planning to urban traffic management. Despite many studies on short-term prediction, only a limited number of studies focus on long-term prediction, and only a couple of them go beyond 24 h. In contrast, this study presents a novel hybrid architecture using location-based traffic characteristics for traffic speed estimation up to 7 days ahead. In this architecture, the introduced mean filtering estimation (MFE) model and a long short-term memory (LSTM) neural network are jointly utilized to minimize the error of traffic flow estimation. Both MFE and LSTM utilize speed data, collected from roadside sensors in İstanbul, from previous weeks that share the same weekday and the same time as the target time to be predicted. Results in this study indicate that the use of MFE gives lower error rates for locations with low traffic complexity, while LSTM outperforms the MFE model for locations with high traffic complexity. Thanks to the introduced MFE and the proposed hybrid architecture, we are able to predict the speed data of a given location with an error lower than +/- 10 km/h. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
26. MODIFIED MINIMAX RANKED SET SAMPLING.
- Author
-
Hanandeh, Ahmad A. and Al-Nasser, Amjad D.
- Subjects
- *
MONTE Carlo method , *QUALITY control charts , *DISTRIBUTION (Probability theory) , *SAMPLING (Process) , *STATISTICAL sampling - Abstract
Based on the traditional ranked set sampling (RSS), Al-Nasser and Al-Omari [1] have recently presented a cost-effective sampling technique called MiniMax RSS (MMRSS). However, MMRSS has some drawbacks when the distribution is asymmetric. To overcome this limitation, in this article, we develop a modified version of MMRSS (MMMRSS). Monte Carlo simulations from numerous symmetric and asymmetric distributions are employed to assess the performance of the suggested MMMRSS mean estimator. Simulation findings demonstrate that the MMMRSS estimator is more efficient than its counterparts based on simple random sampling (SRS) and MMRSS for all distributions considered in this article. Moreover, we have constructed quality control charts to monitor the process mean based on the suggested MMMRSS. The average run length (ARL) performance of these new charts was compared with that of control charts based on several sampling techniques. The results, based on a simulation study, indicate that our suggested MMMRSS control charts performed the best in detecting changes in the process mean in most simulated scenarios. A real-life application concerning the global temperature is also provided as an illustration of the suggested charts. [ABSTRACT FROM AUTHOR]
- Published
- 2021
27. Flexible Signal Denoising via Flexible Empirical Bayes Shrinkage.
- Author
-
Zhengrong Xing, Carbonetto, Peter, and Stephens, Matthew
- Subjects
- *
SIGNAL denoising , *EMPIRICAL Bayes methods , *DATA distribution - Abstract
Signal denoising--also known as non-parametric regression--is often performed through shrinkage estimation in a transformed (e.g., wavelet) domain; shrinkage in the transformed domain corresponds to smoothing in the original domain. A key question in such applications is how much to shrink, or, equivalently, how much to smooth. Empirical Bayes shrinkage methods provide an attractive solution to this problem; they use the data to estimate a distribution of underlying "effects," hence automatically select an appropriate amount of shrinkage. However, most existing implementations of empirical Bayes shrinkage are less flexible than they could be--both in their assumptions on the underlying distribution of effects, and in their ability to handle heteroskedasticity--which limits their signal denoising applications. Here we address this by adopting a particularly flexible, stable and computationally convenient empirical Bayes shrinkage method and applying it to several signal denoising problems. These applications include smoothing of Poisson data and heteroskedastic Gaussian data. We show through empirical comparisons that the results are competitive with other methods, including both simple thresholding rules and purpose-built empirical Bayes procedures. Our methods are implemented in the R package smashr, "SMoothing by Adaptive SHrinkage in R," available at https://www.github.com/stephenslab/smashr. [ABSTRACT FROM AUTHOR]
- Published
- 2021
28. Mean Estimation and Regression Under Heavy-Tailed Distributions: A Survey.
- Author
-
Lugosi, Gábor and Mendelson, Shahar
- Subjects
- *
STATISTICAL learning , *QUANTILE regression - Abstract
We survey some of the recent advances in mean estimation and regression function estimation. In particular, we describe sub-Gaussian mean estimators for possibly heavy-tailed data in both the univariate and multivariate settings. We focus on estimators based on median-of-means techniques, but other methods such as the trimmed-mean and Catoni's estimators are also reviewed. We give detailed proofs for the cornerstone results. We dedicate a section to statistical learning problems—in particular, regression function estimation—in the presence of possibly heavy-tailed data. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
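As a rough illustration of the median-of-means technique named in the abstract of entry 28, the following is a minimal univariate sketch, not code from the survey. The function name `median_of_means`, the shuffling step, and the choice to drop leftover points are assumptions made for this example.

```python
import random
import statistics


def median_of_means(xs, k, seed=0):
    """Median-of-means estimate of the mean of xs using k blocks.

    Hypothetical helper for illustration: the data are shuffled, split into
    k blocks of equal size (leftover points are dropped), each block is
    averaged, and the median of the block means is returned.
    """
    if not 1 <= k <= len(xs):
        raise ValueError("k must be between 1 and len(xs)")
    rng = random.Random(seed)
    shuffled = list(xs)
    rng.shuffle(shuffled)
    m = len(shuffled) // k  # block size
    block_means = [
        statistics.fmean(shuffled[i * m:(i + 1) * m]) for i in range(k)
    ]
    return statistics.median(block_means)


# One extreme value corrupts at most one block mean, so the median ignores it.
data = [0.0] * 99 + [1e9]
estimate = median_of_means(data, k=10)
```

With `k=1` the estimate reduces to the ordinary sample mean; larger `k` trades some efficiency for robustness to heavy tails.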
29. Randomly selected order statistics in ranked set sampling: A less expensive comparable alternative to simple random sampling.
- Author
-
Amiri, Saeid, Jafari Jozani, Mohammad, and Modarres, Reza
- Subjects
STATISTICAL sampling ,MEAN square algorithms ,ORDER statistics ,NUMERICAL analysis ,FUNCTIONAL equations - Abstract
Rank-based sampling designs are powerful alternatives to simple random sampling (SRS) and often provide large improvements in the precision of estimators. In many environmental, ecological, agricultural, industrial and/or medical applications the interest lies in sampling designs that are cheaper than SRS and provide comparable estimates. In this paper, we propose a new variation of ranked set sampling (RSS) for estimating the population mean, based on a random selection technique that measures a smaller number of observations than the RSS design. We study the properties of the population mean estimator using the proposed design and provide conditions under which the mean estimator performs better than SRS and some existing rank-based sampling designs. Theoretical results are augmented with some numerical studies and a real-life example, where we also study the performance of our proposed design under perfect and imperfect ranking situations. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
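For readers unfamiliar with the baseline design that entry 29 modifies, here is a minimal sketch of standard balanced ranked set sampling, not the randomly selected variant proposed in the paper. The names `rss_sample`, `draw`, `set_size`, and `cycles` are hypothetical, and ranking by the true value assumes perfect ranking; in practice the ranking would be done by a cheap judgment or auxiliary variable.

```python
import random
import statistics


def rss_sample(draw, set_size, cycles, rng):
    """Balanced ranked set sample (illustrative, assumes perfect ranking).

    In each cycle, for every rank r = 0..set_size-1, a fresh set of
    set_size units is drawn and ranked, and only the unit with rank r
    is measured. Returns the list of measured values.
    """
    measured = []
    for _ in range(cycles):
        for rank in range(set_size):
            ranked_set = sorted(draw(rng) for _ in range(set_size))
            measured.append(ranked_set[rank])
    return measured


rng = random.Random(1)
sample = rss_sample(lambda r: r.gauss(10.0, 2.0), set_size=3, cycles=4, rng=rng)
rss_mean = statistics.fmean(sample)  # RSS estimate of the population mean
```

Only `set_size * cycles` units are measured even though `set_size ** 2 * cycles` units are drawn and ranked, which is where the cost saving of rank-based designs comes from.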
30. Improved estimators for mean estimation in presence of missing information
- Author
-
Garib Nath Singh, A. K. Pandey, Hanaa Abu-Zinadah, and Neveen Sayed-Ahmed
- Subjects
Computer science ,020209 energy ,Population ,Efficiency ,02 engineering and technology ,01 natural sciences ,010305 fluids & plasmas ,Mean estimation ,0103 physical sciences ,Statistics ,0202 electrical engineering, electronic engineering, information engineering ,Imputation (statistics) ,Special case ,education ,Imputation ,education.field_of_study ,Population mean ,General Engineering ,Estimator ,Engineering (General). Civil engineering (General) ,Missing data ,Exponential type ,Missing information ,Auxiliary variable ,TA1-2040 - Abstract
The treatment of incomplete data is an important step in the statistical analysis of most survey datasets. Missing values create a troublesome situation for survey researchers trying to produce precise estimates of the desired population parameters. To handle these situations, imputation methods play a significant role in filling in incomplete response values when it is necessary to use information on complete sampled units and not to discard the data with missingness. Keeping this in mind, our aim is to propose various improved exponential-type imputation methods and the corresponding resultant estimators by using ancillary information. The properties (biases and mean square errors) of the developed estimators have been examined. It has been shown that the estimators of the population mean under similar circumstances due to Prasad [1-3] and some other estimators are special cases of our suggested class of estimators. Results obtained from simulation studies show the desired performance relative to the other estimators.
- Published
- 2021
- Full Text
- View/download PDF
31. On Using the Conventional and Nonconventional Measures of the Auxiliary Variable for Mean Estimation
- Author
-
Sat Gupta, Javid Shabbir, and Ronald Onyango
- Subjects
Article Subject ,Population mean ,General Mathematics ,General Engineering ,Estimator ,Survey research ,Engineering (General). Civil engineering (General) ,01 natural sciences ,Large sample ,010101 applied mathematics ,Auxiliary variables ,010104 statistics & probability ,Mean estimation ,QA1-939 ,Applied mathematics ,TA1-2040 ,0101 mathematics ,Mathematics - Abstract
In this paper, we propose an improved new class of exponential-ratio-type estimators for estimating the finite population mean using the conventional and the nonconventional measures of the auxiliary variable. Expressions for the bias and MSE are obtained under large sample approximation. Both simulation and numerical studies are conducted to validate the theoretical findings. Use of the conventional and the nonconventional measures of the auxiliary variable is very common in survey research, but we observe that this does not add much value in many of the estimators except for our proposed class of estimators.
- Published
- 2021
- Full Text
- View/download PDF
32. A Comprehensive Survey on Local Differential Privacy
- Author
-
Xiaoguang Niu, Xingxing Xiong, Zhaohui Cai, Dan Li, and Shubo Liu
- Subjects
Estimation ,021110 strategic, defence & security studies ,Science (General) ,Computer Networks and Communications ,Statistical learning ,Computer science ,business.industry ,Big data ,0211 other engineering and technologies ,020206 networking & telecommunications ,Multivariate normal distribution ,02 engineering and technology ,Data science ,Oracle ,Q1-390 ,Mean estimation ,Preservation Technique ,0202 electrical engineering, electronic engineering, information engineering ,T1-995 ,Differential privacy ,business ,Technology (General) ,Information Systems - Abstract
With the advent of the era of big data, privacy issues have become a hot topic in public discussion. Local differential privacy (LDP) is a state-of-the-art privacy preservation technique that allows big data analysis (e.g., statistical estimation, statistical learning, and data mining) to be performed while guaranteeing each individual participant’s privacy. In this paper, we present a comprehensive survey of LDP. We first give an overview of the fundamental knowledge of LDP and its frameworks. We then introduce the mainstream privatization mechanisms and methods in detail from the perspective of frequency oracles and give insights into recent studies on private basic statistical estimation (e.g., frequency estimation and mean estimation) and complex statistical estimation (e.g., multivariate distribution estimation and private estimation over complex data) under LDP. Furthermore, we present the current state of research on LDP, including private statistical learning/inference, private statistical data analysis, privacy amplification techniques for LDP, and some application fields under LDP. Finally, we identify future research directions and open challenges for LDP. This survey can serve as a good reference source for research on LDP to deal with various privacy-related scenarios encountered in practice.
- Published
- 2020
- Full Text
- View/download PDF
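To make the idea of mean estimation under LDP in entry 32 concrete, the sketch below uses the classical Laplace mechanism; this is one standard mechanism chosen for illustration and is not claimed to be the approach the survey recommends. It assumes each user's value lies in [-1, 1] (so the sensitivity is 2), and the names `perturb` and `ldp_mean` are hypothetical.

```python
import random


def perturb(value, epsilon, rng):
    """Report one user's value with Laplace noise of scale 2/epsilon.

    Values are clipped to [-1, 1], so any two inputs differ by at most 2.
    A Laplace variate is generated as the difference of two independent
    exponential variates with rate epsilon/2.
    """
    clipped = max(-1.0, min(1.0, value))
    rate = epsilon / 2.0
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return clipped + noise


def ldp_mean(values, epsilon, rng):
    """Server-side estimate: average of the users' noisy reports."""
    reports = [perturb(v, epsilon, rng) for v in values]
    return sum(reports) / len(reports)


rng = random.Random(0)
true_values = [rng.uniform(-1.0, 1.0) for _ in range(10_000)]
estimate = ldp_mean(true_values, epsilon=1.0, rng=rng)
```

Because the noise has zero mean, the average of the reports is an unbiased estimate of the mean of the clipped values, with variance that shrinks as the number of users grows and grows as `epsilon` decreases.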
33. Small area mean estimation after effect clustering
- Author
-
Jiahua Chen and Zhihuang Yang
- Subjects
Statistics and Probability ,021103 operations research ,0211 other engineering and technologies ,02 engineering and technology ,Articles ,01 natural sciences ,010104 statistics & probability ,Mean estimation ,Small area estimation ,After effect ,Statistics ,0101 mathematics ,Statistics, Probability and Uncertainty ,Cluster analysis ,Mathematics - Abstract
Providing reliable estimates of subpopulation/area parameters has attracted increased attention due to their importance in applications such as policymaking. Due to low or even no samples from some areas, we must adopt indirect model approaches. Existing indirect small area estimation methods often assume that a single nested error regression model is suitable for all the small areas. In particular, the effects of the auxiliary variables are either fixed or have a single attraction center. In some applications, it can be more appropriate to cluster the small areas so that the effects of the auxiliary variables are fixed but have multiple centers in the nested error regression model. In this paper, we examine an extended nested error regression model in which the auxiliary variables have mixed effects with multiple centers. We use a penalty approach to identify these centers and estimate the model parameters simultaneously. We then propose two new small area mean estimators and construct estimators of their mean square errors. Simulations based on artificial and realistic finite populations show that the new estimators can be efficient. Furthermore, the confidence intervals based on the new methods have accurate coverage probabilities. We illustrate the proposed methods with the Survey of Labour and Income Dynamics conducted in Canada.
- Published
- 2022
34. Nonparametric Mean Estimation for Big-but-Biased Data
- Author
-
Laura Borrajo and Ricardo Cao
- Subjects
Bias Correction ,Big Data ,Kernel Method ,mean estimation ,Nonparametric Inference ,General Works - Abstract
Some authors have recently warned about the risks of the sentence "with enough data, the numbers speak for themselves". The problem of nonparametric statistical inference in big data in the presence of sampling bias is considered in this work. The mean estimation problem is studied in this setup, in a nonparametric framework, when the biasing weight function is unknown (the realistic case). The problem of not knowing the weight function is remedied by having a small SRS of the real population. This problem is related to nonparametric density estimation. The asymptotic expression for the MSE of the proposed estimator is considered. Some simulations illustrate the performance of the nonparametric method proposed in this work.
- Published
- 2018
- Full Text
- View/download PDF
35. Key-value data collection and statistical analysis with local differential privacy.
- Author
-
Zhu, Hui, Tang, Xiaohu, Yang, Laurence Tianruo, Fu, Chao, and Peng, Shuangrong
- Subjects
- *
STATISTICS , *ACQUISITION of data , *STATISTICAL accuracy , *BUDGET , *PRIVACY , *DATA analysis - Abstract
The collection and statistical analysis of simple data types (e.g., categorical, numerical and multi-dimensional data) under local differential privacy has been widely studied. Recently, researchers have focused on the collection of the key-value data, which is one of the main types of NoSQL data model. In the collection and statistical analysis of key-value data under local differential privacy, the frequency and mean of each key must be estimated simultaneously. However, achieving a good utility-privacy tradeoff is difficult, because key-value data has inherent correlation, and some users may have different numbers of key-value pairs. In this paper, we propose an efficient sampling based scheme for collecting and analyzing key-value data. Note that the more valid data collected, the higher the accuracy of statistical data under the same disturbance level and disturbance algorithm. Therefore, we make full use of probability sampling and the inherent correlation of key-value data to improve the probability of users submitting valid key-value data. Moreover, we optimize the budget allocation on key-value data, so that the overall variance of frequency and mean estimation is close to optimal. Detailed theoretical analysis and experimental results show that the proposed scheme is superior to existing schemes in accuracy.
• We propose an efficient SKV-GRR scheme with separate key and value selection for collecting and analyzing key-value data.
• In the key selection, we use unequal probability sampling to improve the probability of users submitting valid data.
• The value selection based on weak correlated perturbation can improve the probability of users submitting valid value data.
• We optimize the budget allocation on the selected key and the selected value to improve the accuracy of estimated data. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
36. Two nonparametric approaches to mean absolute deviation portfolio selection model
- Author
-
Huan Zhu, Zhifeng Dai, and Fenghua Wen
- Subjects
0209 industrial biotechnology ,021103 operations research ,Control and Optimization ,Applied Mathematics ,Strategy and Management ,0211 other engineering and technologies ,Nonparametric statistics ,02 engineering and technology ,Atomic and Molecular Physics, and Optics ,Absolute deviation ,Mean estimation ,020901 industrial engineering & automation ,Computer Science::Computational Engineering, Finance, and Science ,Stock exchange ,Kernel (statistics) ,Econometrics ,Portfolio ,Business and International Management ,Electrical and Electronic Engineering ,Mathematics - Abstract
In this paper, we apply two nonparametric approaches to the mean absolute deviation (MAD) portfolio selection model. The first uses nonparametric kernel mean estimation, with five different kernel functions, to replace the returns of assets. We then construct the nonparametric kernel mean estimation-based MAD portfolio model. The second uses nonparametric kernel median estimation, with five different kernel functions, to replace the returns of assets. We then construct the nonparametric kernel median estimation-based MAD portfolio model. We also extend the two nonparametric approaches to the mean-Conditional Value-at-Risk portfolio model. Finally, we give an in-sample and out-of-sample analysis of the proposed strategies and compare the performance of the proposed models using actual stock returns from the Shanghai Stock Exchange in China. The experimental results show that the nonparametric estimation-based portfolio models are more efficient than the original portfolio model.
- Published
- 2020
- Full Text
- View/download PDF
37. Refining Mean-field Approximations by Dynamic State Truncation
- Author
-
Francesca Randone, Luca Bortolussi, and Mirco Tribastone
- Subjects
Computer Networks and Communications ,Truncation ,Computer science ,Markov population processe ,mean-field model ,02 engineering and technology ,state-space truncation ,01 natural sciences ,Mean estimation ,010104 statistics & probability ,mean-field models ,Simple (abstract algebra) ,Master equation ,Computer Science (miscellaneous) ,0202 electrical engineering, electronic engineering, information engineering ,Applied mathematics ,State space ,Limit (mathematics) ,0101 mathematics ,Safety, Risk, Reliability and Quality ,Mathematics ,Refining (metallurgy) ,Markov population processes ,020206 networking & telecommunications ,State (functional analysis) ,Mean field theory ,Hardware and Architecture ,mean estimation ,Orbit (dynamics) ,Probability distribution ,Software - Abstract
Mean-field models are an established method to analyze large stochastic systems with N interacting objects by means of simple deterministic equations that are asymptotically correct when N tends to infinity. For finite N, mean-field equations provide an approximation whose accuracy is model- and parameter-dependent. Recent research has focused on refining the approximation by computing suitable quantities associated with expansions of order $1/N$ and $1/N^2$ to the mean-field equation. In this paper we present a new method for refining mean-field approximations. It couples the master equation governing the evolution of the probability distribution of a truncation of the original state space with a mean-field approximation of a time-inhomogeneous population process that dynamically shifts the truncation across the whole state space. We provide a result of asymptotic correctness in the limit when the truncation covers the state space; for finite truncations, the equations give a correction of the mean-field approximation. We apply our method to examples from the literature to show that, even with modest truncations, it is effective in models that cannot be refined using existing techniques due to non-differentiable drifts, and that it can outperform the state of the art in challenging models that cause instability due to orbit cycles in their mean-field equations.
- Published
- 2021
- Full Text
- View/download PDF
38. Concentration study of M-estimators using the influence function
- Author
-
Mathieu, Timothée, Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 (CRIStAL), Centrale Lille-Université de Lille-Centre National de la Recherche Scientifique (CNRS), Scool (Scool), Inria Lille - Nord Europe, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 (CRIStAL), Centrale Lille-Université de Lille-Centre National de la Recherche Scientifique (CNRS)-Centrale Lille-Université de Lille-Centre National de la Recherche Scientifique (CNRS), Statistique mathématique et apprentissage (CELESTE), Inria Saclay - Ile de France, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire de Mathématiques d'Orsay (LMO), and Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)
- Subjects
Statistics and Probability ,62F35 (Primary) 60G25 (Secondary) ,MSC2020 subject classifications: Primary 62F35 ,secondary 60G25 ,concentration inequalities ,mean estimation ,FOS: Mathematics ,Mathematics - Statistics Theory ,Robust Statistics ,Statistics Theory (math.ST) ,Statistics, Probability and Uncertainty ,[MATH]Mathematics [math] - Abstract
We present a new finite-sample analysis of M-estimators of locations in $\mathbb{R}^d$ using the tool of the influence function. In particular, we show that the deviations of an M-estimator can be controlled thanks to its influence function (or its score function) and then, we use concentration inequality on M-estimators to investigate the robust estimation of the mean in high dimension in a corrupted setting (adversarial corruption setting) for bounded and unbounded score functions. For a sample of size $n$ and covariance matrix $\Sigma$, we attain the minimax speed $\sqrt{Tr(\Sigma)/n}+\sqrt{\|\Sigma\|_{op}\log(1/\delta)/n}$ with probability larger than $1-\delta$ in a heavy-tailed setting. One of the major advantages of our approach compared to others recently proposed is that our estimator is tractable and fast to compute even in very high dimension with a complexity of $O(nd\log(Tr(\Sigma)))$ where $n$ is the sample size and $\Sigma$ is the covariance matrix of the inliers. In practice, the code that we make available for this article proves to be very fast.
- Published
- 2021
- Full Text
- View/download PDF
39. Robust multivariate mean estimation: The optimality of trimmed mean
- Author
-
Gábor Lugosi and Shahar Mendelson
- Subjects
Statistics and Probability ,Multivariate statistics ,Mean estimation ,Multivariate random variable ,Truncated mean ,Estimator ,Mathematics - Statistics Theory ,Statistics Theory (math.ST) ,Extension (predicate logic) ,robust estimation ,62G08 ,Statistics ,FOS: Mathematics ,62J02 ,60G25 ,Statistics, Probability and Uncertainty ,Mathematics - Abstract
We consider the problem of estimating the mean of a random vector based on i.i.d. observations and adversarial contamination. We introduce a multivariate extension of the trimmed-mean estimator and show its optimal performance under minimal conditions.
- Published
- 2021
40. Mean Field and Refined Mean Field Approximations for Heterogeneous Systems: It Works!
- Author
-
Sebastian Allmeier, Nicolas Gast, Performance analysis and optimization of LARge Infrastructures and Systems (POLARIS), Inria Grenoble - Rhône-Alpes, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire d'Informatique de Grenoble (LIG), Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP ), Université Grenoble Alpes (UGA)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP ), Université Grenoble Alpes (UGA), and ANR-19-CE23-0015,REFINO,Optimisation grace au Champ Moyen Raffiné(2019)
- Subjects
FOS: Computer and information sciences ,Computer Science - Performance ,Markov population processes ,Computer Networks and Communications ,Probability (math.PR) ,[MATH.MATH-PR]Mathematics [math]/Probability [math.PR] ,Performance (cs.PF) ,[INFO.INFO-PF]Computer Science [cs]/Performance [cs.PF] ,[INFO.INFO-NI]Computer Science [cs]/Networking and Internet Architecture [cs.NI] ,mean field models ,mean estimation ,Hardware and Architecture ,mean field approximation ,Computer Science (miscellaneous) ,FOS: Mathematics ,[MATH.MATH-OC]Mathematics [math]/Optimization and Control [math.OC] ,heterogeneity ,[MATH]Mathematics [math] ,Safety, Risk, Reliability and Quality ,Software ,Mathematics - Probability - Abstract
Mean field approximation is a powerful technique to study the performance of large stochastic systems represented as $n$ interacting objects. Applications include load balancing models, epidemic spreading, cache replacement policies, or large-scale data centers. Mean field approximation is asymptotically exact for systems composed of $n$ homogeneous objects under mild conditions. In this paper, we study what happens when objects are heterogeneous. This can represent servers with different speeds or contents with different popularities. We define an interaction model that allows obtaining asymptotic convergence results for stochastic systems with heterogeneous object behavior, and show that the error of the mean field approximation is of order $O(1/n)$. More importantly, we show how to adapt the refined mean field approximation, developed by Gast et al. (2019), and show that the error of this approximation is reduced to $O(1/n^2)$. To illustrate the applicability of our result, we present two examples. The first addresses a list-based cache replacement model RANDOM($m$), which is an extension of the RANDOM policy. The second is a heterogeneous supermarket model. These examples show that the proposed approximations are computationally tractable and very accurate. They also show that for moderate system sizes ($n\approx30$) the refined mean field approximation tends to be more accurate than simulations for any reasonable simulation time.
- Published
- 2021
- Full Text
- View/download PDF
41. Efficient designs for mean estimation in multilevel populations and test norming
- Author
-
Francesco Innocenti, van Breukelen, Gerard, Candel, Math, Tan, Frans, FHML Methodologie & Statistiek, and RS: CAPHRI - R6 - Promoting Health & Personalised Care
- Subjects
Optimal design ,statistical efficiency ,sampling ,Sampling (statistics) ,reference value ,sample size calculation ,Test (assessment) ,Mean estimation ,Efficiency ,Sample size determination ,Statistics ,survey ,Mathematics - Abstract
A crucial step in the research process is the choice of the design of the study because a poorly designed study can have serious consequences for science (e.g. biased or unreliable results) and society (e.g. a waste of resources or bad decisions in health and education based on invalid research conclusions). This thesis deals with the design of two types of studies: surveys for mean estimation in multilevel populations (e.g. estimation of average alcohol consumption by students grouped in schools), and normative studies for estimating reference values for psychological test scores and questionnaires (e.g. to measure patients’ symptoms). Both types of studies are of practical importance: results from surveys can help policymakers, and reference values are used by clinicians or educators to assess individuals. Thus, averages and reference values must be estimated with the highest possible precision, but without wasting resources (i.e. time and money). Hence, the main objective of this thesis is to provide guidelines for planning both types of studies to achieve precise estimates using minimum resources.
- Published
- 2021
42. Quantile regression-ratio-type estimators for mean estimation under complete and partial auxiliary information
- Author
-
Malik Muhammad Anas, Muhammad Hanif, Usman Shahzad, and Irsa Sajjad
- Subjects
Mean estimation ,Two phase sampling ,Mean squared error ,Ordinary least squares ,Statistics ,General Engineering ,Estimator ,Type (model theory) ,Regression ,Mathematics ,Quantile regression - Abstract
Traditional ordinary least squares (OLS) regression is commonly utilized to develop regression-ratio-type estimators with traditional measures of location. Abid et al. (2016b) extended this idea and developed regression-ratio-type estimators with traditional and non-traditional measures of location. In this article, quantile regression with traditional and non-traditional measures of location is utilized and a class of ratio-type mean estimators is proposed. The theoretical mean square error (MSE) expressions are also derived. The work is also extended to two-phase sampling (partial information). The pertinence of the proposed and existing groups of estimators is shown by considering real data collections originating from different sources. The findings are encouraging, and the superior performance of the proposed group of estimators is observed and documented throughout the article.
- Published
- 2020
- Full Text
- View/download PDF
43. A Mathematical Model for A Cladding Fastener to Estimate the Maximum Pull-Out Force Capacity
- Author
-
Ismail Abubakar, Roger O'Brien, Vahid Hassani, Hamid Ahmad Mehrabi, Zunaidi Ibrahim, and Adrian Morris
- Subjects
Austenite ,History ,business.product_category ,Computer science ,business.industry ,High strength steel ,Thread (computing) ,Structural engineering ,Cladding (fiber optics) ,Fastener ,Computer Science Applications ,Education ,Purlin ,Mean estimation ,sub_mechanicalengineering ,business ,Roof - Abstract
In the last few years, considerable attention has been paid to the roof cladding systems due to their progressive use in the construction of low-rise buildings. The design of such systems has been gaining importance since they are subjected to severe damage and failure caused by high wind events, particularly at their fastener connection points. To offer a solution for predicting the maximum pull-out force capacity of cladding fasteners, this article presents a mathematical model for a fastener made of high strength steel austenitic 316. In this model, the two basic parameters of the fastener, namely the thread depth and the thread angle are included as the main elements of the contact surface between threads and the low carbon mild steel batten/purlin sheets. This mathematical model will be proposed to estimate the maximum pull-out force capacity of the cladding fasteners made of cold-formed A2 316 stainless steel. After finding the parameters of the mathematical model by using an optimization method based on a genetic algorithm (GA), a comparison will be made between the mean estimation error of the new model and the formerly proposed ones.
- Published
- 2020
44. Improved loss estimation for a normal mean matrix
- Author
-
William E. Strawderman and Takeru Matsuda
- Subjects
Statistics and Probability ,Shrinkage estimator ,Estimation ,Numerical Analysis ,Estimator ,020206 networking & telecommunications ,02 engineering and technology ,01 natural sciences ,010104 statistics & probability ,Matrix (mathematics) ,Mean estimation ,Singular value ,Singular value decomposition ,0202 electrical engineering, electronic engineering, information engineering ,Mean vector ,Applied mathematics ,0101 mathematics ,Statistics, Probability and Uncertainty ,Mathematics - Abstract
We investigate improved loss estimation in the matrix mean estimation problem. Specifically, for estimators of a normal mean matrix, we consider estimation of the Frobenius loss. Based on the singular values of the observation, we develop loss estimators that dominate the unbiased loss estimator for a broad class of matrix mean estimators including the Efron–Morris estimator. This is an extension of the results of Johnstone (1988) for a normal mean vector. We also provide improved estimators of loss for reduced-rank estimators. Numerical results show the effectiveness of the proposed loss estimators.
- Published
- 2019
45. Surrogate space based dimension reduction for nonignorable nonresponse
- Author
-
Jianqiu Deng, Xiaojie Yang, and Qihua Wang
- Subjects
Statistics and Probability ,Computer science ,Applied Mathematics ,Dimensionality reduction ,Structural dimension ,Sufficient dimension reduction ,Missing data ,Space (mathematics) ,Data set ,Computational Mathematics ,Mean estimation ,Computational Theory and Mathematics ,Algorithm ,Subspace topology - Abstract
Sufficient dimension reduction (SDR) for nonignorable nonresponse poses a challenge, and the literature on this issue is scarce. In the nonignorable case, the SDR methods developed for ignorable missing data generally yield serious estimation bias and thus are invalid. A regression-calibration-based cumulative mean estimation (RC-CUME) procedure is proposed to recover the central subspace (CS) with the aid of a surrogate subspace. Asymptotic properties of the RC-CUME are investigated. A modified BIC-type criterion is used to determine the structural dimension of the CS. Some extensions to other SDR methods are presented. Simulation studies are conducted to assess the finite-sample performance of the proposed RC-CUME approach, and a real data set is analyzed for illustration.
- Published
- 2022
46. Analyses of integrated aircraft cabin contaminant monitoring network based on Kalman consensus filter
- Author
-
Yanxiao Li, Zengqiang Chen, Hui Sun, and Rui Wang
- Subjects
0209 industrial biotechnology ,Engineering ,Consensus filter ,business.industry ,Applied Mathematics ,Node (networking) ,Stability (learning theory) ,02 engineering and technology ,Kalman filter ,Sensor fusion ,Computer Science Applications ,Mean estimation ,020901 industrial engineering & automation ,Computer Science::Systems and Control ,Control and Systems Engineering ,Control theory ,0202 electrical engineering, electronic engineering, information engineering ,Wireless ,Errors-in-variables models ,020201 artificial intelligence & image processing ,Electrical and Electronic Engineering ,business ,Instrumentation - Abstract
Modern civil aircraft use air-ventilated pressurized cabins, which are subject to limited space. In order to monitor multiple contaminants and overcome the hypersensitivity of a single sensor, the paper constructs an output-correction integrated sensor configuration using sensors with different measurement theories, after comparing it with two other configurations. The proposed configuration works as a node in the distributed wireless sensor network for contaminant monitoring. The corresponding measurement error models of the integrated sensors are also proposed, using the Kalman consensus filter to estimate states and conduct data fusion in order to regulate the single-sensor measurement results. The paper develops a sufficient proof of the stability of the Kalman consensus filter when system and observation noises are considered, and compares the mean estimation and mean consensus errors of the Kalman consensus filter and the local Kalman filter. The numerical example analyses show the effectiveness of the algorithm.
- Published
- 2017
47. A Bayesian analysis of classical shadows
- Author
-
Ryan S. Bennink, Joseph M. Lukens, and Kody J. H. Law
- Subjects
Statistical assumption ,Computer Networks and Communications ,Computer science ,QC1-999 ,Bayesian probability ,FOS: Physical sciences ,01 natural sciences ,010305 fluids & plasmas ,Through-the-lens metering ,symbols.namesake ,Mean estimation ,Quantum state ,0103 physical sciences ,Computer Science (miscellaneous) ,010306 general physics ,Quantum ,Quantum Physics ,Ground truth ,Physics ,Hilbert space ,Statistical and Nonlinear Physics ,QA75.5-76.95 ,Computational Theory and Mathematics ,Electronic computers. Computer science ,symbols ,Quantum Physics (quant-ph) ,Algorithm - Abstract
The method of classical shadows heralds unprecedented opportunities for quantum estimation with limited measurements [H.-Y. Huang, R. Kueng, and J. Preskill, Nat. Phys. 16, 1050 (2020)]. Yet its relationship to established quantum tomographic approaches, particularly those based on likelihood models, remains unclear. In this article, we investigate classical shadows through the lens of Bayesian mean estimation (BME). In direct tests on numerical data, BME is found to attain significantly lower error on average, but classical shadows prove remarkably more accurate in specific situations -- such as high-fidelity ground truth states -- which are improbable in a fully uniform Hilbert space. We then introduce an observable-oriented pseudo-likelihood that successfully emulates the dimension-independence and state-specific optimality of classical shadows, but within a Bayesian framework that ensures only physical states. Our research reveals how classical shadows effect important departures from conventional thinking in quantum state estimation, as well as the utility of Bayesian methods for uncovering and formalizing statistical assumptions. Comment: 8 pages, 5 figures
- Published
- 2020
48. Sub-Gaussian estimators of the mean of a random vector
- Author
-
Shahar Mendelson and Gábor Lugosi
- Subjects
Statistics and Probability ,Independent and identically distributed random variables ,Multivariate statistics ,Mean estimation ,Multivariate random variable ,Gaussian ,Second moment of area ,Mathematics - Statistics Theory ,Sample (statistics) ,01 natural sciences ,010104 statistics & probability ,symbols.namesake ,Statistics - Machine Learning ,62G08 ,Statistics ,Applied mathematics ,62J02 ,60G25 ,0101 mathematics ,Mathematics ,Estimator ,robust estimation ,sub-Gaussian inequalities ,symbols ,Statistics, Probability and Uncertainty ,Invariant estimator - Abstract
We study the problem of estimating the mean of a random vector $X$ given a sample of $N$ independent, identically distributed points. We introduce a new estimator that achieves a purely sub-Gaussian performance under the only condition that the second moment of $X$ exists. The estimator is based on a novel concept of a multivariate median. Comment: 12 pages
- Published
- 2019
49. Mean Estimation under Imputation based on Two-Phase Sampling Design using an Auxiliary Variable
- Author
-
Ranjita Pandey and Kalpana Yadav
- Subjects
Statistics and Probability ,Two phase sampling ,Bias, Mean squared error, Two-phase sampling scheme, Relative efficiency ,Mean squared error ,Population mean ,lcsh:Mathematics ,05 social sciences ,Estimator ,Management Science and Operations Research ,lcsh:QA1-939 ,01 natural sciences ,Auxiliary variables ,010104 statistics & probability ,Mean estimation ,Efficiency ,Modeling and Simulation ,0502 economics and business ,Statistics ,050211 marketing ,Imputation (statistics) ,0101 mathematics ,Statistics, Probability and Uncertainty ,lcsh:Statistics ,lcsh:HA1-4737 ,Mathematics - Abstract
The present article offers more efficient imputation-based estimators of the population mean under the framework of two-phase sampling in the presence of an auxiliary variable. The theoretical conditions stating the superiority of the proposed estimators over some prevalent competing estimators, in terms of relative efficiency, are established through numerical illustrations based on three different data sets from the classical statistical literature.
- Published
- 2016
50. Comparison of centralised scaled unscented Kalman filter and extended Kalman filter for multisensor data fusion architectures
- Author
-
Zirui Xing and Yuanqing Xia
- Subjects
Analysis of covariance ,0209 industrial biotechnology ,Computer science ,020206 networking & telecommunications ,02 engineering and technology ,Kalman filter ,Covariance ,computer.software_genre ,Sensor fusion ,Running time ,Mean estimation ,Extended Kalman filter ,020901 industrial engineering & automation ,Signal Processing ,Data fusion algorithms ,0202 electrical engineering, electronic engineering, information engineering ,Data mining ,Electrical and Electronic Engineering ,Algorithm ,computer - Abstract
This study presents three non-linear centralised scaled unscented Kalman filter (SUKF) algorithms for multisensor data fusion: augmented measurements, weighted measurements and sequential filtering fusion. First, the accuracy of the extended Kalman filter (EKF) and the SUKF is analysed in detail. Second, the error covariance traces and the absolute mean estimation errors in the X and Y directions of the centralised SUKF data fusion algorithms are compared with those of the corresponding centralised EKF data fusion algorithms. The comparison shows that the centralised augmented measurements SUKF algorithm performs best among the six algorithms; that is, Algorithm (Iu) shows the best accuracy. Finally, a combined analysis that also takes into account the running time of the six algorithms shows that Algorithm (Iu) is the best overall.
- Published
- 2016