
Theoretical optimization of group size in group normalization for enhanced deep neural network training.

Authors :
Babu, Bachina Harish
Vinothini, V. R.
Solaimalai, Gautam
Vanitha, R.
Yadav, Ajay Singh
Sukania, P.
Vijayan, V.
Srinivasan, R.
Source :
AIP Conference Proceedings; 2024, Vol. 3193 Issue 1, p1-12, 12p
Publication Year :
2024

Abstract

Numerous normalization layers have recently been developed to stabilize the training of deep neural networks (DNNs). Group normalization is one such technique; it generalizes instance normalization and layer normalization by allowing flexibility in the number of groups it uses. However, determining the most effective number of groups requires time-consuming trial-and-error hyperparameter tuning. In this study, we present a practical and effective method for selecting the number of groups. We first observe that the number of groups affects the gradient behavior of the group normalization layer. By deriving the optimal group size, we can calibrate the gradient scale for gradient-descent optimization. This work is the first to propose an upper bound on the group size that accounts for theoretical considerations, architectural constraints, and the ability to produce an adequate value independently for each layer. Across a wide range of neural network architectures, tasks, and datasets, the proposed method outperformed state-of-the-art procedures. [ABSTRACT FROM AUTHOR]
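For context, the sketch below shows a plain group-normalization layer (as introduced by Wu and He, 2018) in which the number of groups is an explicit hyperparameter; it illustrates only the quantity that the paper proposes to set analytically, not the paper's derivation of the optimal group size. The helper name `group_norm` is our own, and the example assumes an (N, C, H, W) input.

```python
# Minimal sketch of group normalization with a configurable number of groups.
# This reproduces only the standard GroupNorm computation, not the paper's
# optimal-group-size derivation.
import torch


def group_norm(x: torch.Tensor, num_groups: int, eps: float = 1e-5) -> torch.Tensor:
    """Normalize an (N, C, H, W) tensor over groups of channels."""
    n, c, h, w = x.shape
    assert c % num_groups == 0, "channel count must be divisible by the group count"
    # Reshape so that each group's channels are normalized together.
    x = x.view(n, num_groups, c // num_groups, h, w)
    mean = x.mean(dim=(2, 3, 4), keepdim=True)
    var = x.var(dim=(2, 3, 4), unbiased=False, keepdim=True)
    x = (x - mean) / torch.sqrt(var + eps)
    return x.view(n, c, h, w)


# num_groups = 1 recovers layer normalization over (C, H, W);
# num_groups = C recovers instance normalization.
x = torch.randn(8, 32, 16, 16)
out = group_norm(x, num_groups=8)  # matches torch.nn.GroupNorm(8, 32, affine=False)(x)
```

Choosing `num_groups` is exactly the trial-and-error step the abstract refers to; the paper's contribution is a theoretically motivated bound on this value per layer.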

Subjects :
ARTIFICIAL neural networks

Details

Language :
English
ISSN :
0094-243X
Volume :
3193
Issue :
1
Database :
Complementary Index
Journal :
AIP Conference Proceedings
Publication Type :
Conference
Accession number :
180847032
Full Text :
https://doi.org/10.1063/5.0232854