Efficient Adaptive Online Learning via Frequent Directions
- Authors
- Yuanyu Wan and Lijun Zhang
- Subjects
- Computer science, Applied Mathematics, Outer product, Approximation algorithm, Space (mathematics), Matrix (mathematics), Computational Theory and Mathematics, Artificial Intelligence, Symmetric matrix, Computer Vision and Pattern Recognition, Time complexity, Subgradient method, Algorithms, Software, Curse of dimensionality
- Abstract
By employing time-varying proximal functions, adaptive subgradient methods (ADAGRAD) improve the regret bound and have been widely used in online learning and optimization. However, ADAGRAD with full-matrix proximal functions (ADA-FULL) cannot handle large-scale problems, despite performing better when gradients are correlated, because of its impractical $O(d^3)$ time and $O(d^2)$ space complexities. In this paper, we propose two efficient variants of ADA-FULL based on a matrix sketching technique called frequent directions (FD). The first variant, named ADA-FD, directly utilizes FD to maintain and manipulate low-rank matrices, which reduces the space and time complexities to $O(\tau d)$ and $O(\tau^2 d)$ respectively, where $d$ is the dimensionality and $\tau \ll d$ is the sketching size. The second variant, named ADA-FFD, further adopts a doubling trick to accelerate the FD step used in ADA-FD, which reduces the average time complexity to $O(\tau d)$ while only doubling the space complexity of ADA-FD. Theoretical analysis reveals that the regret of ADA-FD and ADA-FFD is close to that of ADA-FULL as long as the outer product matrix of the gradients is approximately low-rank. Experimental results demonstrate the efficiency and effectiveness of our algorithms.
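The key ingredient described above is the frequent directions (FD) sketch, which maintains a small matrix $B \in \mathbb{R}^{\tau \times d}$ whose Gram matrix $B^\top B$ approximates the running sum of gradient outer products $\sum_t g_t g_t^\top$. Below is a minimal Python sketch of the standard FD update (following Liberty's frequent directions); the function and variable names are illustrative assumptions, not taken from the paper.

import numpy as np

def frequent_directions(gradients, sketch_size):
    """Maintain a sketch B (sketch_size x d) whose Gram matrix B^T B
    approximates the running sum of gradient outer products g g^T.
    Minimal illustration of frequent directions; not the paper's code."""
    d = gradients[0].shape[0]
    B = np.zeros((sketch_size, d))
    for g in gradients:
        zero_rows = np.where(~B.any(axis=1))[0]
        if len(zero_rows) == 0:
            # Sketch is full: shrink the singular values so that at
            # least half of the rows of B become zero again.
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            delta = s[sketch_size // 2] ** 2
            s = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
            B = s[:, None] * Vt
            zero_rows = np.where(~B.any(axis=1))[0]
        # Insert the new gradient into an empty row of the sketch.
        B[zero_rows[0]] = g
    return B

# Example: sketch 1000 random 50-dimensional gradients with tau = 10.
rng = np.random.default_rng(0)
grads = [rng.standard_normal(50) for _ in range(1000)]
B = frequent_directions(grads, sketch_size=10)

Per the abstract, ADA-FD would use such a $\tau \times d$ sketch in place of the full $d \times d$ outer-product matrix of ADA-FULL: the SVD in the shrinking step is the dominant $O(\tau^2 d)$ cost, storing only $B$ gives the $O(\tau d)$ space, and ADA-FFD's doubling trick amortizes the shrinking to reach $O(\tau d)$ average time.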
- Published
- 2022