Author: "Samworth, Richard J [0000-0003-2426-4679]" / Language: undetermined - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Samworth, Richard J [0000-0003-2426-4679]"' showing total 2 results

1. USP: an independence test that improves on Pearson's chi-squared and the $G$-test

Author: Berrett, Thomas B; Samworth, Richard J [0000-0003-2426-4679]; Apollo - University of Cambridge Repository
Subjects: FOS: Computer and information sciences, statistic, General Mathematics, independence, General Physics and Astronomy, Mathematics - Statistics Theory, Machine Learning (stat.ML), Statistics Theory (math.ST), Fisher’s exact test, Statistics - Applications, 01 natural sciences, G-test, Methodology (stat.ME), 010104 statistics & probability, Statistics - Machine Learning, Research articles, 62H17, 62H20, 62F03, 62F05, 62E20, 0502 economics and business, FOS: Mathematics, stat.TH, Applications (stat.AP), 0101 mathematics, stat.AP, Statistics - Methodology, 050205 econometrics, Pearson’s χ²-test, 05 social sciences, General Engineering, math.ST, stat.ML, stat.ME, permutation test
Abstract: We present the $U$-Statistic Permutation (USP) test of independence in the context of discrete data displayed in a contingency table. Either Pearson's chi-squared test of independence or the $G$-test is typically used for this task, but we argue that these tests have serious deficiencies, both in terms of their inability to control the size of the test, and their power properties. By contrast, the USP test is guaranteed to control the size of the test at the nominal level for all sample sizes, has no issues with small (or zero) cell counts, and is able to detect distributions that violate independence in only a minimal way. The test statistic is derived from a $U$-statistic estimator of a natural population measure of dependence, and we prove that this is the unique minimum variance unbiased estimator of this population quantity. The practical utility of the USP test is demonstrated both on simulated data, where its power can be dramatically greater than that of Pearson's test and the $G$-test, and on real data. The USP test is implemented in the R package USP. Comment: 27 pages, 7 figures
Published: 2021
Full Text: View/download PDF
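
The permutation idea behind the size guarantee described in the abstract can be illustrated with a minimal Python sketch. It is not the USP test itself: the abstract does not give the $U$-statistic, so a simple plug-in statistic (the sum of squared differences between joint and product-of-marginal cell frequencies) is swapped in here as an assumption, and the function names dependence_statistic and permutation_p_value are hypothetical. The p-value construction, which counts the observed statistic among the permuted ones, is the standard device that keeps a permutation test at or below its nominal level for any sample size.

```python
import random
from collections import Counter


def dependence_statistic(xs, ys):
    """Plug-in measure of dependence between two categorical samples.

    Assumption: this is NOT the U-statistic of the USP paper, only a
    simple stand-in: sum over cells of (joint freq - product of marginals)^2.
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    return sum(
        (joint[(a, b)] / n - row[a] * col[b] / n**2) ** 2
        for a in row
        for b in col
    )


def permutation_p_value(xs, ys, n_perm=999, seed=0):
    """Permutation p-value for independence of paired categorical data."""
    rng = random.Random(seed)
    observed = dependence_statistic(xs, ys)
    ys_perm = list(ys)
    exceed = 0
    for _ in range(n_perm):
        # Shuffling one margin breaks any dependence but keeps both marginals.
        rng.shuffle(ys_perm)
        if dependence_statistic(xs, ys_perm) >= observed:
            exceed += 1
    # The "+1" terms count the observed data as one of the permutations.
    return (1 + exceed) / (1 + n_perm)
```

Because the reference distribution is built from the data's own marginals, zero or small cell counts need no special handling in this sketch.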

2. High-dimensional principal component analysis with heterogeneous missingness

Author: Zhu, Ziwei [0000-0001-9536-0575]; Wang, Tengyao [0000-0003-2072-6645]; Samworth, Richard J [0000-0003-2426-4679]; Apollo - University of Cambridge Repository
Subjects: FOS: Computer and information sciences, Statistics and Probability, heterogeneous missingness, principal component analysis, iterative projections, Mathematics - Statistics Theory, Statistics Theory (math.ST), math.ST, Methodology (stat.ME), missing data, stat.ME, FOS: Mathematics, stat.TH, 62H25, HA Statistics, Statistics, Probability and Uncertainty, high‐dimensional statistics, Statistics - Methodology
Abstract: We study the problem of high-dimensional Principal Component Analysis (PCA) with missing observations. In simple, homogeneous missingness settings with a noise level of constant order, we show that an existing inverse-probability weighted (IPW) estimator of the leading principal components can (nearly) attain the minimax optimal rate of convergence. However, deeper investigation reveals both that, particularly in more realistic settings where the missingness mechanism is heterogeneous, the empirical performance of the IPW estimator can be unsatisfactory, and moreover that, in the noiseless case, it fails to provide exact recovery of the principal components. Our main contribution, then, is to introduce a new method for high-dimensional PCA, called 'primePCA', that is designed to cope with situations where observations may be missing in a heterogeneous manner. Starting from the IPW estimator, primePCA iteratively projects the observed entries of the data matrix onto the column space of our current estimate to impute the missing entries, and then updates our estimate by computing the leading right singular space of the imputed data matrix. It turns out that the interaction between the heterogeneity of missingness and the low-dimensional structure is crucial in determining the feasibility of the problem. We therefore introduce an incoherence condition on the principal components and prove that in the noiseless case, the error of primePCA converges to zero at a geometric rate when the signal strength is not too small. An important feature of our theoretical guarantees is that they depend on average, as opposed to worst-case, properties of the missingness mechanism. Our numerical studies on both simulated and real data reveal that primePCA exhibits very encouraging performance across a wide range of scenarios. Comment: 42 pages, 4 figures
Published: 2022
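
The impute-then-refit iteration described in the abstract can be sketched roughly as follows, using NumPy. This is a simplified illustration under stated assumptions, not the authors' implementation: missing entries are encoded as NaN, the initial estimate is taken from a zero-filled SVD rather than the IPW estimator the abstract names, the number of iterations is fixed rather than chosen by a convergence rule, and the function names (impute_and_refit, iterative_projection_pca, subspace_distance) are hypothetical.

```python
import numpy as np


def impute_and_refit(Y, V):
    """One iteration: impute each row from its observed entries, then refit V.

    Y is an (n, d) array with NaN marking missing entries; V is a (d, K)
    current estimate of the leading principal components.
    """
    K = V.shape[1]
    Y_hat = np.nan_to_num(Y)  # rows with no observed entry stay at zero
    for i in range(Y.shape[0]):
        obs = ~np.isnan(Y[i])
        if obs.all() or not obs.any():
            continue
        # Least-squares coefficients of the observed entries on the
        # matching rows of V (projection onto the current column space).
        u = np.linalg.lstsq(V[obs], Y[i, obs], rcond=None)[0]
        Y_hat[i, ~obs] = V[~obs] @ u
    # Leading K right singular vectors of the imputed matrix.
    Vt = np.linalg.svd(Y_hat, full_matrices=False)[2]
    return Vt[:K].T


def iterative_projection_pca(Y, K, n_iter=50):
    """Repeat the impute-and-refit step from a zero-filled SVD start.

    Assumption: the zero-filled start replaces the IPW initial estimate.
    """
    V = np.linalg.svd(np.nan_to_num(Y), full_matrices=False)[2][:K].T
    for _ in range(n_iter):
        V = impute_and_refit(Y, V)
    return V


def subspace_distance(V1, V2):
    """Frobenius distance between the projections onto two column spaces."""
    return np.linalg.norm(V1 @ V1.T - V2 @ V2.T)
```

On noiseless low-rank data with a moderate fraction of entries missing at random, comparing subspace_distance before and after the iterations gives a quick check of whether the estimate is approaching the true principal subspace.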
