SPARQ-SGD: Event-Triggered and Compressed Communication in Decentralized Optimization
- Author
- Jemin George, Deepesh Data, Navjot Singh, and Suhas Diggavi
- Subjects
- Discrete mathematics, Computer science, Stochastic process, Node (networking), Graph (abstract data type), Rate of convergence, Compression, Event (probability theory), Computer Science Applications, Control and Systems Engineering, Electrical and Electronic Engineering
- Abstract
In this paper, we propose and analyze SPARQ-SGD, an event-triggered and compressed algorithm for decentralized training of large-scale machine learning models over a graph. Each node can locally compute a condition (event) which triggers a communication where quantized and sparsified local model parameters are sent. In SPARQ-SGD, each node first takes a fixed number of local gradient steps and then checks whether the model parameters have significantly changed since its last update; it communicates further compressed model parameters only when there is a significant change, as specified by a (design) criterion. We prove that SPARQ-SGD converges as $O\left(\frac{1}{nT}\right)$ and $O\left(\frac{1}{\sqrt{nT}}\right)$ in the strongly convex and non-convex settings, respectively, demonstrating that aggressive compression, including event-triggered communication, model sparsification, and quantization, does not affect the overall convergence rate compared to uncompressed decentralized training, thereby theoretically yielding communication efficiency for ‘free’. We evaluate SPARQ-SGD over real datasets to demonstrate significant savings in communication bits over the state-of-the-art.
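Purely as an illustration of the mechanism described in the abstract, a minimal per-node sketch in NumPy is given below. The top-k sparsifier, stochastic quantizer, trigger threshold, step size, and gossip weight used here are simplified, hypothetical stand-ins, not the paper's exact operators or schedules.

```python
import numpy as np

def top_k_sparsify(v, k):
    """Keep the k largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def stochastic_quantize(v, levels=16):
    """Unbiased stochastic quantization onto `levels` uniform levels."""
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    scaled = np.abs(v) / norm * levels
    lower = np.floor(scaled)
    q = lower + (np.random.rand(*v.shape) < (scaled - lower))
    return np.sign(v) * q * norm / levels

def sparq_sgd_node_step(x, x_last_sent, grad_fn, lr, local_steps,
                        trigger_threshold, k, neighbors_msgs, gossip_weight):
    """One communication round of a single node (illustrative sketch, not the
    paper's exact update rule).

    x             : current local model parameters
    x_last_sent   : parameters reconstructible from past transmitted messages
    grad_fn       : callable returning a stochastic gradient at x
    neighbors_msgs: compressed updates received from neighbors this round
    Returns (x, x_last_sent, message_to_send or None).
    """
    # 1) Fixed number of local stochastic gradient steps.
    for _ in range(local_steps):
        x = x - lr * grad_fn(x)

    # 2) Event trigger: communicate only if the model has changed
    #    significantly since the last transmitted update.
    change = x - x_last_sent
    message = None
    if np.linalg.norm(change) ** 2 > trigger_threshold:
        # 3) Aggressive compression: sparsify, then quantize the change.
        message = stochastic_quantize(top_k_sparsify(change, k))
        x_last_sent = x_last_sent + message

    # 4) Gossip step: mix in compressed updates received from neighbors.
    for msg in neighbors_msgs:
        x = x + gossip_weight * msg

    return x, x_last_sent, message

# Example usage on a toy quadratic objective (hypothetical parameters):
grad_fn = lambda x: 2 * (x - np.ones(10)) + 0.01 * np.random.randn(10)
x, x_sent = np.zeros(10), np.zeros(10)
x, x_sent, msg = sparq_sgd_node_step(x, x_sent, grad_fn, lr=0.05,
                                     local_steps=5, trigger_threshold=1e-3,
                                     k=3, neighbors_msgs=[], gossip_weight=0.5)
```

In this sketch the trigger condition gates the compressed transmission, so no bits are exchanged in rounds where the local model has barely moved; per the abstract, the paper's analysis shows that such aggressive compression does not degrade the convergence rate relative to uncompressed decentralized training.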
- Published
- 2023