
Scalable and Memory-Efficient Kernel Ridge Regression

Authors :
Xiaoye S. Li
Gustavo Chávez
Yang Liu
Pieter Ghysels
Elizaveta Rebrova
Source :
IPDPS
Publication Year :
2020
Publisher :
IEEE, 2020.

Abstract

We present a scalable and memory-efficient framework for kernel ridge regression. We exploit the inherent rank deficiency of the kernel ridge regression matrix by constructing an approximation that relies on a hierarchy of low-rank factorizations of tunable accuracy, rather than on leverage scores or other subsampling techniques. Without ever decompressing the kernel matrix approximation, we propose factorization and solve methods to compute the weights for a given set of training and test data. We show that our method performs an optimal number of operations, $\mathcal{O}(r^2 n)$, with respect to the number of training samples $n$, owing to the underlying numerical low-rank structure (of rank $r$) of the kernel matrix. Furthermore, each algorithm is presented in the context of a massively parallel computer system, exploiting two levels of concurrency that account for both shared-memory (intra-node) and distributed-memory (inter-node) parallelism. In addition, we present a variety of experiments on popular datasets, both small and large, to show that our approach provides accuracy comparable to state-of-the-art methods and to the exact (i.e., non-approximated) kernel ridge regression method. For datasets on the order of $10^6$ data points, we show that our framework strong-scales to $10^3$ cores. Finally, we provide a Python interface to the scikit-learn library so that scikit-learn can leverage our high-performance solver library to achieve much-improved performance and a reduced memory footprint.
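For context, the exact (non-approximated) baseline that the abstract compares against solves the dense linear system $(K + \lambda I)\,\alpha = y$ for the weights $\alpha$ and predicts via $f(x) = \sum_i \alpha_i\, k(x, x_i)$. The following is a minimal sketch of that dense baseline in plain NumPy; it is not the authors' hierarchical solver or its scikit-learn interface, and the function names (`rbf_kernel`, `krr_fit`, `krr_predict`), the choice of an RBF kernel, and all parameter values are illustrative assumptions. Forming $K$ costs $\mathcal{O}(n^2)$ memory and the dense solve costs $\mathcal{O}(n^3)$ time, which is precisely what the hierarchical low-rank approximation is designed to avoid.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Gaussian (RBF) kernel matrix: K[i, j] = exp(-gamma * ||x_i - y_j||^2)
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    return np.exp(-gamma * sq_dists)

def krr_fit(X_train, y_train, lam=1e-3, gamma=1.0):
    # Exact kernel ridge regression: solve (K + lam * I) alpha = y.
    # O(n^2) memory for K and O(n^3) time for the dense solve.
    n = X_train.shape[0]
    K = rbf_kernel(X_train, X_train, gamma)
    return np.linalg.solve(K + lam * np.eye(n), y_train)

def krr_predict(X_test, X_train, alpha, gamma=1.0):
    # Prediction: f(x) = sum_i alpha_i * k(x, x_i)
    return rbf_kernel(X_test, X_train, gamma) @ alpha

# Illustrative usage on synthetic data (hypothetical parameters).
rng = np.random.default_rng(0)
X_train = rng.standard_normal((200, 5))
y_train = np.sin(X_train[:, 0]) + 0.1 * rng.standard_normal(200)
X_test = rng.standard_normal((20, 5))

alpha = krr_fit(X_train, y_train, lam=1e-3, gamma=0.5)
y_pred = krr_predict(X_test, X_train, alpha, gamma=0.5)
```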

Details

Database :
OpenAIRE
Journal :
2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
Accession number :
edsair.doi...........9bc4c574bce4d01b1cfe955199118b84