Adaptive Distributed Parallel Training Method for a Deep Learning Model Based on Dynamic Critical Paths of DAG.

Authors :: Zeng, Yan
Wang, Wei
Ding, Yong
Zhang, Jilin
Ren, Yongjian
Yi, Guangzheng
Source :: Mathematics (2227-7390). Dec 2022, Vol. 10 Issue 24, p4788. 21p.
Publication Year :: 2022
Abstract: AI provides a new method for massive simulated data calculations in molecular dynamics, materials, and other scientific computing fields. However, the complex structures and large-scale parameters of neural network models make them difficult to develop and train. The automatic parallel technology based on graph algorithms is one of the most promising methods to solve this problem, despite the low efficiency in the design, implementation, and execution of distributed parallel policies for large-scale neural network models. In this paper, we propose an adaptive distributed parallel training method based on the dynamic generation of critical DAG (directed acyclic graph) paths, called FD-DPS, to solve this efficiency problem. Firstly, the proposed model splits operators with the dimension of the tensor, which can expand the space available for model parallelism. Secondly, a dynamic critical path generation method is employed to determine node priority changes in the DAG of the neural network models. Finally, the model implements the optimal scheduling of critical paths based on the priority of the nodes, thereby improving the performance of parallel strategies. Our experiments show that FD-DPS can achieve 12.76% and 11.78% faster training on PnasNet_mobile and ResNet_200 models, respectively, compared with the MP-DPS and Fast methods. [ABSTRACT FROM AUTHOR]
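Illustration (not from the record): the abstract refers to critical paths in the DAG of a neural network and to node priorities derived from them. The sketch below shows the general idea on a toy operator graph, under stated assumptions. It computes the critical (longest cost-weighted) path once, statically; it is not the authors' FD-DPS method, which regenerates critical paths dynamically. The function name `critical_path`, the operator names, and the per-operator costs are all hypothetical.

```python
from graphlib import TopologicalSorter


def critical_path(cost, succ):
    """Return (length, path, rank) for a cost-weighted DAG.

    cost: dict mapping node -> execution cost (hypothetical values)
    succ: dict mapping node -> list of successor nodes
    rank[n] is the cost of n plus the longest path below it; it can
    serve as a scheduling priority (higher rank runs first).
    """
    # graphlib expects node -> predecessors, so invert the successor map.
    preds = {n: set() for n in cost}
    for u, vs in succ.items():
        for v in vs:
            preds[v].add(u)
    order = list(TopologicalSorter(preds).static_order())

    rank, best_next = {}, {}
    # Walk in reverse topological order so successors are ranked first.
    for n in reversed(order):
        nxt = max(succ.get(n, ()), key=lambda v: rank[v], default=None)
        best_next[n] = nxt
        rank[n] = cost[n] + (rank[nxt] if nxt is not None else 0)

    # The critical path starts at the highest-ranked node.
    node = max(rank, key=rank.get)
    length, path = rank[node], []
    while node is not None:
        path.append(node)
        node = best_next[node]
    return length, path, rank


# Toy operator graph with made-up costs.
cost = {"conv1": 4, "conv2": 3, "pool": 1, "concat": 2, "fc": 5}
succ = {
    "conv1": ["conv2", "pool"],
    "conv2": ["concat"],
    "pool": ["concat"],
    "concat": ["fc"],
    "fc": [],
}
length, path, rank = critical_path(cost, succ)
priority_order = sorted(rank, key=rank.get, reverse=True)
```

In this toy graph the `conv2` branch is more expensive than the `pool` branch, so it lies on the critical path and receives the higher priority. A dynamic variant would recompute `rank` whenever measured costs or placements change.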

Subjects :: *DEEP learning
*ARTIFICIAL neural networks
*CRITICAL path analysis
*GRAPH algorithms
*SCIENTIFIC computing
*DYNAMIC models

Details

Language :: English
ISSN :: 2227-7390
Volume :: 10
Issue :: 24
Database :: Academic Search Index
Journal :: Mathematics (2227-7390)
Publication Type :: Academic Journal
Accession number :: 161003123
Full Text :: https://doi.org/10.3390/math10244788

