
Lifelong language learning with adaptive uncertainty regularization.

Authors :
Zhang, Lei
Wang, Shupeng
Yuan, Fajie
Geng, Binzong
Yang, Min
Source :
Information Sciences. Apr 2023, Vol. 622, p794-807. 14p.
Publication Year :
2023

Abstract

It has been a long-standing goal in natural language processing (NLP) to learn a general linguistic intelligence model that performs well on many different NLP tasks that evolve continually over time, without revisiting all previous data at each stage. Most existing deep neural networks suffer from catastrophic forgetting when handling sequential tasks incrementally, leading to dramatic performance degradation because the training data of old tasks is no longer available. In this paper, we propose a Lifelong language learning method with Adaptive Uncertainty Regularization (LAUR), which adapts a single BERT model to continuously arriving text examples from different NLP tasks. Specifically, LAUR is built on the Bayesian online learning framework, and three uncertainty regularization terms are devised to collaboratively control the parameters so as to resolve the stability-plasticity dilemma in lifelong language learning. The previous posterior constrains the parameters that strongly determine the output, preventing them from changing drastically, while the remaining parameters are encouraged to update over time. In addition, we propose a task-specific residual adaptation module placed in parallel to each layer of BERT, endowing LAUR with the capacity to learn better task-specific knowledge. This configuration makes LAUR less prone to losing the knowledge stored in the base BERT network when learning a new task. Experimental results show that LAUR outperforms state-of-the-art lifelong learning models on a variety of NLP tasks. For reproducibility, we submit the code and data at: https://github.com/kiujhytgtrfd2021/LAUR. [ABSTRACT FROM AUTHOR]
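For intuition only, the sketch below illustrates two of the ideas summarized in the abstract: a small residual adapter added in parallel to a (frozen) BERT layer, and a quadratic penalty in the spirit of Bayesian online learning that discourages confident parameters from drifting. The abstract does not specify LAUR's actual formulation, so the class and function names, the bottleneck size, and the single-term penalty shown here are assumptions, not the paper's method (which uses three regularization terms).

```python
# Minimal sketch, not the authors' implementation. All names here
# (ResidualAdapter, uncertainty_penalty, bottleneck=64, etc.) are assumed
# for illustration; LAUR's actual adapter design and its three
# regularization terms are described in the full paper.
import torch
import torch.nn as nn

class ResidualAdapter(nn.Module):
    """Task-specific module added in parallel to a BERT layer's output."""
    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Residual connection: when the adapter output is near zero, the
        # base BERT representation passes through unchanged, which limits
        # interference with knowledge stored in the backbone.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

def uncertainty_penalty(model: nn.Module,
                        prev_means: dict,
                        prev_precisions: dict) -> torch.Tensor:
    """One quadratic regularizer in the Bayesian online learning spirit:
    parameters the previous posterior is confident about (high precision)
    pay a larger cost for moving away from their old values."""
    penalty = torch.zeros(())
    for name, param in model.named_parameters():
        if name in prev_means:
            penalty = penalty + (prev_precisions[name]
                                 * (param - prev_means[name]) ** 2).sum()
    return penalty
```

In a continual-learning loop of this kind, the loss for the current task would typically be the task loss plus a weighted uncertainty penalty, with prev_means and prev_precisions taken from the posterior estimated after the previous task; LAUR's three collaborating terms refine how this trade-off between stability and plasticity is controlled.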

Details

Language :
English
ISSN :
0020-0255
Volume :
622
Database :
Academic Search Index
Journal :
Information Sciences
Publication Type :
Periodical
Accession number :
161816962
Full Text :
https://doi.org/10.1016/j.ins.2022.11.141