12 results for "TROPP, JOEL A."
Search Results
2. Robust, randomized preconditioning for kernel ridge regression
- Author
-
Díaz, Mateo, Epperly, Ethan N., Frangella, Zachary, Tropp, Joel A., and Webber, Robert J.
- Subjects
Mathematics - Numerical Analysis, Statistics - Machine Learning, 68W20, 65F10, 65F55
- Abstract
This paper investigates two randomized preconditioning techniques for solving kernel ridge regression (KRR) problems with a medium to large number of data points ($10^4 \leq N \leq 10^7$), and it introduces two new methods with state-of-the-art performance. The first method, RPCholesky preconditioning, accurately solves the full-data KRR problem in $O(N^2)$ arithmetic operations, assuming sufficiently rapid polynomial decay of the kernel matrix eigenvalues. The second method, KRILL preconditioning, offers an accurate solution to a restricted version of the KRR problem involving $k \ll N$ selected data centers at a cost of $O((N + k^2) k \log k)$ operations. The proposed methods solve a broad range of KRR problems, making them ideal for practical applications., Comment: 29 pages, 11 figures
- Published
- 2023
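A minimal sketch of the preconditioning idea summarized in the abstract of entry 2 above: form a rank-k Nyström-type factor F with K approximately F F^T, apply (F F^T + lam*I)^(-1) as a preconditioner through the Woodbury identity, and run preconditioned CG on the regularized kernel system. The column selection here is plain uniform sampling standing in for the RPCholesky pivot rule (sketched under entry 3), and the kernel, data sizes, and regularization are illustrative assumptions, not the paper's code.

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular
from scipy.sparse.linalg import LinearOperator, cg

def rbf_kernel(X, Y, bandwidth=1.0):
    # Gaussian kernel matrix between the rows of X and the rows of Y.
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * bandwidth ** 2))

rng = np.random.default_rng(0)
N, k, lam = 1000, 100, 1e-3
Xdata = rng.standard_normal((N, 3))
y = rng.standard_normal(N)
K = rbf_kernel(Xdata, Xdata)                        # full kernel matrix (dense here)

cols = rng.choice(N, size=k, replace=False)         # stand-in for RPCholesky pivots
L = cholesky(K[np.ix_(cols, cols)] + 1e-10 * np.eye(k), lower=True)
F = solve_triangular(L, K[cols, :], lower=True).T   # K ~ F @ F.T (Nystrom factor)

# Woodbury: (F F^T + lam I)^{-1} v = (v - F (lam I + F^T F)^{-1} F^T v) / lam
small = lam * np.eye(k) + F.T @ F
def precondition(v):
    return (v - F @ np.linalg.solve(small, F.T @ v)) / lam

A = LinearOperator((N, N), matvec=lambda v: K @ v + lam * v)
M = LinearOperator((N, N), matvec=precondition)
alpha, info = cg(A, y, M=M, maxiter=500)
print("CG exit flag:", info, "residual:", np.linalg.norm(K @ alpha + lam * alpha - y))
```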
3. Randomly pivoted Cholesky: Practical approximation of a kernel matrix with few entry evaluations
- Author
-
Chen, Yifan, Epperly, Ethan N., Tropp, Joel A., and Webber, Robert J.
- Subjects
Mathematics - Numerical Analysis, Statistics - Machine Learning, 65F55, 65C99, 68T05
- Abstract
The randomly pivoted partial Cholesky algorithm (RPCholesky) computes a factorized rank-k approximation of an N x N positive-semidefinite (psd) matrix. RPCholesky requires only (k + 1) N entry evaluations and O(k^2 N) additional arithmetic operations, and it can be implemented with just a few lines of code. The method is particularly useful for approximating a kernel matrix. This paper offers a thorough new investigation of the empirical and theoretical behavior of this fundamental algorithm. For matrix approximation problems that arise in scientific machine learning, experiments show that RPCholesky matches or beats the performance of alternative algorithms. Moreover, RPCholesky provably returns low-rank approximations that are nearly optimal. The simplicity, effectiveness, and robustness of RPCholesky strongly support its use in scientific computing and machine learning applications., Comment: 40 pages, 4 figures
- Published
- 2022
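A minimal dense-matrix sketch of the procedure the abstract of entry 3 describes: at each step, sample the next pivot in proportion to the current residual diagonal and perform one step of partial Cholesky. A practical implementation would evaluate only the (k + 1)N requested matrix entries; here the psd matrix is simply held in memory, and the sizes and seed are illustrative.

```python
import numpy as np

def rpcholesky(A, k, rng=None):
    """Return F (N x k) with F @ F.T approximating the psd matrix A."""
    rng = np.random.default_rng(rng)
    N = A.shape[0]
    F = np.zeros((N, k))
    d = np.diag(A).astype(float).copy()       # residual diagonal
    for j in range(k):
        i = rng.choice(N, p=d / d.sum())      # pivot sampled ~ residual diagonal
        g = A[:, i] - F[:, :j] @ F[i, :j]     # residual column at the pivot
        F[:, j] = g / np.sqrt(g[i])
        d = np.maximum(d - F[:, j] ** 2, 0.0) # keep the diagonal nonnegative
    return F

# Tiny usage example on a random psd matrix.
rng = np.random.default_rng(1)
G = rng.standard_normal((400, 100))
A = G @ G.T                                   # rank-100 psd matrix
F = rpcholesky(A, k=50, rng=rng)
print("relative trace error:", 1 - np.trace(F @ F.T) / np.trace(A))
```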
4. Efficient error and variance estimation for randomized matrix computations
- Author
-
Epperly, Ethan N. and Tropp, Joel A.
- Subjects
Mathematics - Numerical Analysis, Statistics - Machine Learning, 62F40, 65F55, 68W20
- Abstract
Randomized matrix algorithms have become workhorse tools in scientific computing and machine learning. To use these algorithms safely in applications, they should be coupled with posterior error estimates to assess the quality of the output. To meet this need, this paper proposes two diagnostics: a leave-one-out error estimator for randomized low-rank approximations and a jackknife resampling method to estimate the variance of the output of a randomized matrix computation. Both of these diagnostics are rapid to compute for randomized low-rank approximation algorithms such as the randomized SVD and randomized Nyström approximation, and they provide useful information that can be used to assess the quality of the computed output and guide algorithmic parameter choices., Comment: 22 pages, 10 figures, 13 pages of supplementary material. v5: added derivation of leave-one-out error estimate to supplementary material
- Published
- 2022
- Full Text
- View/download PDF
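The abstract of entry 4 above describes two diagnostics; the snippet below illustrates only the jackknife idea, in its most naive form: rerun a randomized SVD with one sketch column deleted at a time and report the spread of the resulting rank-k approximations. The paper derives much cheaper formulas that avoid this recomputation; the test matrix, sketch size, and Frobenius-norm aggregation here are illustrative assumptions.

```python
import numpy as np

def rand_svd_approx(A, Omega, k):
    # Rangefinder plus projection: A ~ Q (Q^T A), truncated to rank k.
    Q, _ = np.linalg.qr(A @ Omega)
    B = Q.T @ A
    U, sig, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ U[:, :k]) * sig[:k] @ Vt[:k]

rng = np.random.default_rng(2)
A = rng.standard_normal((200, 80)) @ np.diag(0.7 ** np.arange(80)) @ rng.standard_normal((80, 120))
k, s = 10, 25
Omega = rng.standard_normal((120, s))

Ahat = rand_svd_approx(A, Omega, k)
replicates = [rand_svd_approx(A, np.delete(Omega, j, axis=1), k) for j in range(s)]
mean_rep = sum(replicates) / s
jack_var = (s - 1) / s * sum(np.linalg.norm(R - mean_rep, "fro") ** 2 for R in replicates)
print("jackknife std estimate:", np.sqrt(jack_var))
print("actual error:          ", np.linalg.norm(A - Ahat, "fro"))
```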
5. Fixed-Rank Approximation of a Positive-Semidefinite Matrix from Streaming Data
- Author
-
Tropp, Joel A., Yurtsever, Alp, Udell, Madeleine, and Cevher, Volkan
- Subjects
Computer Science - Numerical Analysis, Computer Science - Data Structures and Algorithms, Statistics - Machine Learning
- Abstract
Several important applications, such as streaming PCA and semidefinite programming, involve a large-scale positive-semidefinite (psd) matrix that is presented as a sequence of linear updates. Because of storage limitations, it may only be possible to retain a sketch of the psd matrix. This paper develops a new algorithm for fixed-rank psd approximation from a sketch. The approach combines the Nystrom approximation with a novel mechanism for rank truncation. Theoretical analysis establishes that the proposed method can achieve any prescribed relative error in the Schatten 1-norm and that it exploits the spectral decay of the input matrix. Computer experiments show that the proposed method dominates alternative techniques for fixed-rank psd matrix approximation across a wide range of examples.
- Published
- 2017
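A condensed sketch of the streaming idea outlined in the abstract of entry 5: maintain the sketch Y = A*Omega under linear updates, then reconstruct a fixed-rank psd approximation from (Omega, Y) alone via a shifted Nyström step and rank truncation. The shift and the update stream below are simplified, illustrative choices rather than the paper's recommended implementation.

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

rng = np.random.default_rng(3)
N, s, r = 300, 40, 10
Omega = rng.standard_normal((N, s))     # fixed random test matrix
Y = np.zeros((N, s))                    # sketch of the psd matrix

A = np.zeros((N, N))                    # kept here only to check the error
for t in range(50):                     # stream of rank-one updates A <- A + h h^T
    h = (0.9 ** t) * rng.standard_normal(N)
    A += np.outer(h, h)
    Y += np.outer(h, h @ Omega)         # sketch update, O(N s) per update

# Reconstruction: shifted Nystrom approximation, then rank truncation.
nu = np.finfo(float).eps * np.linalg.norm(Y)
Ynu = Y + nu * Omega
C = cholesky((Omega.T @ Ynu + Ynu.T @ Omega) / 2, lower=True)
E = solve_triangular(C, Ynu.T, lower=True).T        # E @ E.T = Nystrom approximation
U, sig, _ = np.linalg.svd(E, full_matrices=False)
lam = np.maximum(sig[:r] ** 2 - nu, 0)              # remove the shift, clip at zero
Ahat = (U[:, :r] * lam) @ U[:, :r].T
print("relative error:", np.linalg.norm(A - Ahat) / np.linalg.norm(A))
```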
6. Sketchy Decisions: Convex Low-Rank Matrix Optimization with Optimal Storage
- Author
-
Yurtsever, Alp, Udell, Madeleine, Tropp, Joel A., and Cevher, Volkan
- Subjects
Mathematics - Optimization and Control, Statistics - Machine Learning
- Abstract
This paper concerns a fundamental class of convex matrix optimization problems. It presents the first algorithm that uses optimal storage and provably computes a low-rank approximation of a solution. In particular, when all solutions have low rank, the algorithm converges to a solution. This algorithm, SketchyCGM, modifies a standard convex optimization scheme, the conditional gradient method, to store only a small randomized sketch of the matrix variable. After the optimization terminates, the algorithm extracts a low-rank approximation of the solution from the sketch. In contrast to nonconvex heuristics, the guarantees for SketchyCGM do not rely on statistical models for the problem data. Numerical work demonstrates the benefits of SketchyCGM over heuristics.
- Published
- 2017
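A toy illustration of the mechanism described in the abstract of entry 6: run the conditional gradient method for minimizing 0.5 * ||A(X) - b||^2 over a nuclear-norm ball, but never store the matrix variable X, only the measurement vector z = A(X) and two small random sketches of X, from which a low-rank estimate of the solution is recovered at the end. The measurement model, sketch sizes, iteration count, and dense SVD of the gradient are illustrative simplifications, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n, r, alpha, d = 60, 50, 3, 5.0, 1500
Xtrue = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
Xtrue *= alpha / np.linalg.norm(Xtrue, "nuc")         # place the target in the ball
Ameas = rng.standard_normal((d, m, n)) / np.sqrt(d)   # Gaussian linear measurements
b = np.einsum("dmn,mn->d", Ameas, Xtrue)

k, l = 2 * r + 1, 4 * r + 3
Omega, Psi = rng.standard_normal((n, k)), rng.standard_normal((l, m))
z = np.zeros(d)                                       # A(X), maintained directly
Y, W = np.zeros((m, k)), np.zeros((l, n))             # the only record of X

for t in range(300):
    G = np.einsum("d,dmn->mn", z - b, Ameas)          # gradient A*(z - b)
    U, _, Vt = np.linalg.svd(G)                       # dense SVD is fine at toy size
    H = -alpha * np.outer(U[:, 0], Vt[0])             # CGM atom on the nuclear ball
    eta = 2.0 / (t + 2)
    z = (1 - eta) * z + eta * np.einsum("dmn,mn->d", Ameas, H)
    Y = (1 - eta) * Y + eta * (H @ Omega)             # sketch updates track X
    W = (1 - eta) * W + eta * (Psi @ H)

# Recover a rank-r estimate of the solution from the sketches alone.
Q, _ = np.linalg.qr(Y)
C = np.linalg.lstsq(Psi @ Q, W, rcond=None)[0]
U, sig, Vt = np.linalg.svd(C, full_matrices=False)
Xhat = (Q @ U[:, :r]) * sig[:r] @ Vt[:r]
print("relative recovery error:", np.linalg.norm(Xhat - Xtrue) / np.linalg.norm(Xtrue))
```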
7. Practical sketching algorithms for low-rank matrix approximation
- Author
-
Tropp, Joel A., Yurtsever, Alp, Udell, Madeleine, and Cevher, Volkan
- Subjects
Computer Science - Numerical Analysis, Computer Science - Data Structures and Algorithms, Mathematics - Numerical Analysis, Statistics - Computation, Statistics - Machine Learning, Primary 65F30, Secondary 68W20
- Abstract
This paper describes a suite of algorithms for constructing low-rank approximations of an input matrix from a random linear image of the matrix, called a sketch. These methods can preserve structural properties of the input matrix, such as positive-semidefiniteness, and they can produce approximations with a user-specified rank. The algorithms are simple, accurate, numerically stable, and provably correct. Moreover, each method is accompanied by an informative error bound that allows users to select parameters a priori to achieve a given approximation quality. These claims are supported by numerical experiments with real and synthetic data.
- Published
- 2016
- Full Text
- View/download PDF
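A compact sketch of the reconstruction described in the abstract of entry 7: draw a range sketch Y = A*Omega and a co-range sketch W = Psi*A, then rebuild A as Q times the least-squares solution of (Psi Q) X = W, where Q is an orthonormal basis for Y, optionally truncating to a user-specified rank. The sketch sizes and test matrices below are illustrative choices, not the paper's recommended parameter settings.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, r = 400, 300, 10
A = rng.standard_normal((m, 60)) @ np.diag(0.6 ** np.arange(60)) @ rng.standard_normal((60, n))

k, l = 2 * r + 1, 4 * r + 3                 # sketch sizes (illustrative)
Omega = rng.standard_normal((n, k))         # range-sketch test matrix
Psi = rng.standard_normal((l, m))           # co-range-sketch test matrix
Y = A @ Omega                               # range sketch
W = Psi @ A                                 # co-range sketch

Q, _ = np.linalg.qr(Y)                      # orthonormal basis for the captured range
X = np.linalg.lstsq(Psi @ Q, W, rcond=None)[0]   # solve (Psi Q) X = W
Ahat = Q @ X                                # A ~ Q X, rank at most k

U, sig, Vt = np.linalg.svd(X, full_matrices=False)   # optional fixed-rank truncation
Ahat_r = (Q @ U[:, :r]) * sig[:r] @ Vt[:r]
print("relative error:", np.linalg.norm(A - Ahat_r) / np.linalg.norm(A))
```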
8. Universality laws for randomized dimension reduction, with applications
- Author
-
Oymak, Samet and Tropp, Joel A.
- Subjects
Mathematics - Probability, Computer Science - Data Structures and Algorithms, Computer Science - Information Theory, Mathematics - Statistics Theory, Statistics - Machine Learning, 60D05, 60F17 (Primary) 60B20 (Secondary)
- Abstract
Dimension reduction is the process of embedding high-dimensional data into a lower dimensional space to facilitate its analysis. In the Euclidean setting, one fundamental technique for dimension reduction is to apply a random linear map to the data. This dimension reduction procedure succeeds when it preserves certain geometric features of the set. The question is how large the embedding dimension must be to ensure that randomized dimension reduction succeeds with high probability. This paper studies a natural family of randomized dimension reduction maps and a large class of data sets. It proves that there is a phase transition in the success probability of the dimension reduction map as the embedding dimension increases. For a given data set, the location of the phase transition is the same for all maps in this family. Furthermore, each map has the same stability properties, as quantified through the restricted minimum singular value. These results can be viewed as new universality laws in high-dimensional stochastic geometry. Universality laws for randomized dimension reduction have many applications in applied mathematics, signal processing, and statistics. They yield design principles for numerical linear algebra algorithms, for compressed sensing measurement ensembles, and for random linear codes. Furthermore, these results have implications for the performance of statistical estimation methods under a large class of random experimental designs., Comment: v2 and v3 with technical corrections. Code for reproducing figures available at http://users.cms.caltech.edu/~jtropp/
- Published
- 2015
9. An Introduction to Matrix Concentration Inequalities
- Author
-
Tropp, Joel A.
- Subjects
Mathematics - Probability, Computer Science - Data Structures and Algorithms, Computer Science - Information Theory, Computer Science - Numerical Analysis, Statistics - Machine Learning, Primary: 60B20. Secondary: 60F10, 60G50, 60G42
- Abstract
In recent years, random matrices have come to play a major role in computational mathematics, but most of the classical areas of random matrix theory remain the province of experts. Over the last decade, with the advent of matrix concentration inequalities, research has advanced to the point where we can conquer many (formerly) challenging problems with a page or two of arithmetic. The aim of this monograph is to describe the most successful methods from this area along with some interesting examples that these techniques can illuminate., Comment: 163 pages. To appear in Foundations and Trends in Machine Learning
- Published
- 2015
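For orientation, one representative result from the family that the monograph of entry 9 treats is the matrix Bernstein inequality, stated here from memory (consult the monograph for the precise hypotheses and constants): for independent, centered, symmetric d x d random matrices X_i with ||X_i|| <= L almost surely,

```latex
\[
  \mathbb{P}\Big\{ \Big\| \sum_i X_i \Big\| \ge t \Big\}
    \le 2d \,\exp\!\Big( \frac{-t^2/2}{v + Lt/3} \Big),
  \qquad
  v = \Big\| \sum_i \mathbb{E}\,[X_i^2] \Big\|.
\]
```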
10. Factoring nonnegative matrices with linear programs
- Author
-
Bittorf, Victor, Recht, Benjamin, Re, Christopher, and Tropp, Joel A.
- Subjects
Mathematics - Optimization and Control, Computer Science - Learning, Statistics - Machine Learning
- Abstract
This paper describes a new approach, based on linear programming, for computing nonnegative matrix factorizations (NMFs). The key idea is a data-driven model for the factorization where the most salient features in the data are used to express the remaining features. More precisely, given a data matrix X, the algorithm identifies a matrix C such that X approximately equals CX and C satisfies some linear constraints. The constraints are chosen to ensure that the matrix C selects features; these features can then be used to find a low-rank NMF of X. A theoretical analysis demonstrates that this approach has guarantees similar to those of the recent NMF algorithm of Arora et al. (2012). In contrast with this earlier work, the proposed method extends to more general noise models and leads to efficient, scalable algorithms. Experiments with synthetic and real datasets provide evidence that the new approach is also superior in practice. An optimized C++ implementation can factor a multigigabyte matrix in a matter of minutes., Comment: 17 pages, 10 figures. Modified theorem statement for robust recovery conditions. Revised proof techniques to make arguments more elementary. Results on robustness when rows are duplicated have been superseded by arxiv.org/1211.6687
- Published
- 2012
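A heavily simplified self-expression program in the spirit of the abstract of entry 10 (not the paper's exact objective or constraint set): look for a nonnegative matrix C with X approximately equal to CX whose large diagonal entries flag the rows of X that can express the others, then build the nonnegative factorization from those rows. The data model, cvxpy formulation, and recovery step are illustrative assumptions.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(6)
r, f, n = 3, 20, 40
Wtrue = rng.random((r, n))                                # r "anchor" rows
X = np.vstack([Wtrue, rng.random((f - r, r)) @ Wtrue])    # every other row mixes them

C = cp.Variable((f, f), nonneg=True)
residual = cp.sum(cp.abs(X - C @ X))                      # entrywise self-expression error
constraints = [cp.trace(C) == r, cp.diag(C) <= 1]
cp.Problem(cp.Minimize(residual), constraints).solve()

anchors = sorted(np.argsort(-np.diag(C.value))[:r])       # rows with the largest diagonals
print("selected rows:", anchors)                          # ideally 0, 1, 2
# Exact data, so plain least squares already returns nonnegative weights here;
# noisy data would call for nonnegative least squares instead.
H = np.linalg.lstsq(X[anchors].T, X.T, rcond=None)[0].T
print("NMF residual:", np.linalg.norm(X - H @ X[anchors]) / np.linalg.norm(X))
```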
11. Robust computation of linear models by convex relaxation
- Author
-
Lerman, Gilad, McCoy, Michael, Tropp, Joel A., and Zhang, Teng
- Subjects
Computer Science - Information Theory, Statistics - Computation, Statistics - Machine Learning, 62H25, 65K05, 90C22
- Abstract
Consider a dataset of vector-valued observations that consists of noisy inliers, which are explained well by a low-dimensional subspace, along with some number of outliers. This work describes a convex optimization problem, called REAPER, that can reliably fit a low-dimensional model to this type of data. This approach parameterizes linear subspaces using orthogonal projectors, and it uses a relaxation of the set of orthogonal projectors to reach the convex formulation. The paper provides an efficient algorithm for solving the REAPER problem, and it documents numerical experiments which confirm that REAPER can dependably find linear structure in synthetic and natural data. In addition, when the inliers lie near a low-dimensional subspace, there is a rigorous theory that describes when REAPER can approximate this subspace., Comment: Formerly titled "Robust computation of linear models, or How to find a needle in a haystack"
- Published
- 2012
- Full Text
- View/download PDF
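A small cvxpy sketch of the convex program described in the abstract of entry 11: relax the set of rank-d orthogonal projectors to symmetric matrices P with 0 <= P <= I and trace(P) = d, minimize the summed residuals ||(I - P) x_i||, and round the relaxed solution back to a d-dimensional subspace. Data sizes, noise levels, and the generic SDP solver are illustrative; the paper supplies its own efficient algorithm for the REAPER problem.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(7)
D, d, n_in, n_out = 10, 2, 80, 20
basis = np.linalg.qr(rng.standard_normal((D, d)))[0]          # true subspace basis
inliers = rng.standard_normal((n_in, d)) @ basis.T + 0.01 * rng.standard_normal((n_in, D))
outliers = 2.0 * rng.standard_normal((n_out, D))
X = np.vstack([inliers, outliers])

P = cp.Variable((D, D), symmetric=True)
objective = sum(cp.norm((np.eye(D) - P) @ x) for x in X)      # sum of ||(I - P) x_i||_2
constraints = [P >> 0, np.eye(D) - P >> 0, cp.trace(P) == d]
cp.Problem(cp.Minimize(objective), constraints).solve()

eigvecs = np.linalg.eigh(P.value)[1][:, -d:]                  # round: top-d eigenvectors
err = np.linalg.norm(eigvecs @ eigvecs.T - basis @ basis.T)
print("projector estimation error:", err)
```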
12. Sketchy Decisions: Convex Low-Rank Matrix Optimization with Optimal Storage
- Author
-
Yurtsever, Alp, Udell, Madeleine, Tropp, Joel A., and Cevher, Volkan
- Subjects
Mathematics - Optimization and Control, Statistics - Machine Learning
- Abstract
This paper considers a fundamental class of convex matrix optimization problems with low-rank solutions. We show it is possible to solve these problems using far less memory than the natural size of the decision variable when the problem data has a concise representation. Our proposed method, SketchyCGM, is the first algorithm to offer provable convergence to an optimal point with an optimal memory footprint. SketchyCGM modifies a standard convex optimization method - the conditional gradient method - to work on a sketched version of the decision variable, and can recover the solution from this sketch. In contrast to recent work on non-convex methods for this problem class, SketchyCGM is a convex method, and our convergence guarantees do not rely on statistical assumptions.