Adaptive Regularized Warped Gradient Descent Enhances Model Generalization and Meta-learning for Few-shot Learning.
- Source :
- Neurocomputing, Jun 2023, Vol. 537, p. 271-281 (11 pp.)
- Publication Year :
- 2023
Abstract
- Warped Gradient Descent (WarpGrad) is a remarkable meta-learning method that transforms gradients by inserting warp-layers. However, the task-shared initialization provided by WarpGrad is difficult to adapt to each individual task. Moreover, transforming gradients with meta-learned warp-layers ignores local geometric features and task-specific knowledge, and introduces a significant risk of overfitting due to the increased number of parameters. In this paper, we propose ARWarpGrad to achieve better generalization performance with faster convergence by modeling both cross-task and task-specific knowledge. We introduce Initialization Modulation (IM) to meta-learn a task-specific initialization of the task-learner. Furthermore, we put forward Mixed Gradient Preprocessing (MGP), which comprises Adaptive Learning Rates (ALR) and Gaussian Momentum Dropout (GMD), to provide better adaptive update directions and step sizes for task adaptation based on local geometric features. In addition, Memory Regularization (MR) is introduced to effectively alleviate overfitting through the use of parameter memory. Finally, extensive experiments in three settings demonstrate that ARWarpGrad achieves state-of-the-art performance while accelerating convergence and preventing overfitting. [ABSTRACT FROM AUTHOR]
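The abstract describes an inner-loop update in which the raw gradient is first transformed by a meta-learned warp, then scaled by adaptive learning rates and combined with a randomly dropped-out momentum term. The paper's exact formulation is not given here, so the sketch below is purely illustrative: the function name `mixed_gradient_step`, the linear warp matrix, the fixed dropout probability, and the 0.9 momentum coefficient are all assumptions, not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)

def mixed_gradient_step(theta, grad, warp, lr, momentum, drop_p=0.1):
    """One hypothetical inner-loop update in the spirit of Mixed Gradient
    Preprocessing: warp the gradient, randomly drop momentum entries
    (a stand-in for Gaussian Momentum Dropout), and apply per-parameter
    adaptive learning rates (a stand-in for ALR)."""
    warped = warp @ grad                         # warp-layer gradient transform
    keep = rng.random(momentum.shape) > drop_p   # random dropout mask on momentum
    momentum = 0.9 * momentum * keep + warped    # momentum update with dropout
    theta = theta - lr * momentum                # per-parameter adaptive step
    return theta, momentum
```

In this toy form, the warp matrix and the per-parameter learning rates would be meta-learned across tasks, while `theta` and `momentum` are task-specific state updated during adaptation.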
- Subjects :
- MACHINE learning
Details
- Language :
- English
- ISSN :
- 0925-2312
- Volume :
- 537
- Database :
- Academic Search Index
- Journal :
- Neurocomputing
- Publication Type :
- Academic Journal
- Accession number :
- 163185738
- Full Text :
- https://doi.org/10.1016/j.neucom.2023.03.042