Scalable Fully Pipelined Hardware Architecture for In-Network Aggregated AllReduce Communication
- Author
- Liu, Yao; Zhang, Junyi; Liu, Shuo; Wang, Qiaoling; Dai, Wangchen; Cheung, Ray Chak Chung
- Subjects
- Machine learning, Peer-to-peer architecture (Computer networks), Hardware, Bandwidths, Task analysis
- Abstract
The Ring-AllReduce framework is currently the most popular solution for deploying industry-level distributed machine learning tasks. However, even under optimal conditions, it achieves only about half of the maximum bandwidth. In recent years, several in-network aggregation frameworks have been proposed to overcome this drawback, but limited hardware information has been disclosed. In this paper, we propose a scalable fully-pipelined architecture that handles tasks such as forwarding, aggregation, and retransmission with no bandwidth loss. The architecture is implemented on a Xilinx UltraScale FPGA connected to 8 worker servers with 10 Gb/s network adapters, and it can scale to more complex scenarios involving more workers. Compared with Ring-AllReduce, AllReduce-Switch improves the effective bandwidth of AllReduce communication by $1.75\times$. In image training tasks, the proposed hardware architecture achieves up to a $1.67\times$ speedup of the training process. For compute-intensive models, the communication speedup may be partially hidden by computation. In particular, for ResNet-50, AllReduce-Switch accelerates training with MPI and NCCL by $1.30\times$ and $1.04\times$, respectively. [ABSTRACT FROM AUTHOR]
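As a rough aid to the bandwidth claims above, the sketch below works through the standard Ring-AllReduce cost model that the "about half of the maximum bandwidth" figure refers to; the symbols N (workers), M (gradient buffer size), and B (link bandwidth) are illustrative notation, not taken from the paper.

    \documentclass{article}
    \usepackage{amsmath}
    \begin{document}
    % Standard Ring-AllReduce cost model (illustrative notation N, M, B).
    With $N$ workers, a buffer of size $M$, and link bandwidth $B$,
    Ring-AllReduce moves $2\frac{N-1}{N}M$ bytes over each link
    (reduce-scatter plus all-gather), so its effective bandwidth is
    \[
      B_{\mathrm{eff}}^{\mathrm{ring}}
        = \frac{M}{\,2\frac{N-1}{N}M / B\,}
        = \frac{N}{2(N-1)}\,B
        \;\longrightarrow\; \tfrac{1}{2}B \quad (N \to \infty).
    \]
    % With an aggregating switch, each worker sends and receives M once,
    % so the effective bandwidth approaches the full link rate B.
    The ideal gain over the ring is therefore $2(N-1)/N$; for the
    8-worker setup this is $2 \cdot 7 / 8 = 1.75$, consistent with the
    reported $1.75\times$ effective-bandwidth improvement.
    \end{document}

This accounting only bounds the communication phase; end-to-end training speedups (e.g. the $1.30\times$ and $1.04\times$ figures for ResNet-50) are smaller because computation overlaps with and partially hides communication.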
- Published
- 2021