
Imitating the oracle: Towards calibrated model for class incremental learning.

Authors :
Zhu, Fei
Cheng, Zhen
Zhang, Xu-Yao
Liu, Cheng-Lin
Source :
Neural Networks. Jul2023, Vol. 164, p38-48. 11p.
Publication Year :
2023

Abstract

Class-incremental learning (CIL) aims to recognize classes that emerge in different phases. Joint training (JT), which trains the model on all classes together, is often regarded as the upper bound of CIL. In this paper, we thoroughly analyze the differences between CIL and JT in both feature space and weight space. Motivated by this comparative analysis, we propose two types of calibration, feature calibration and weight calibration, to imitate the oracle (ItO), i.e., JT. Specifically, feature calibration introduces deviation compensation to maintain the class decision boundaries of old classes in feature space, while weight calibration leverages forgetting-aware weight perturbation to increase transferability and reduce forgetting in parameter space. With these two calibration strategies, the model is forced to imitate the properties of joint training at each incremental learning stage, yielding better CIL performance. ItO is a plug-and-play method and can be easily integrated into existing approaches. Extensive experiments on several benchmark datasets demonstrate that ItO significantly and consistently improves the performance of existing state-of-the-art methods. Our code is publicly available at https://github.com/Impression2805/ItO4CIL.

• We study how class-incremental learning (CIL) differs from joint training (i.e., the oracle) and identify the crucial differences in both feature space and weight space; accordingly, we propose to improve CIL by imitating the oracle (ItO).
• In feature space, the proposed feature calibration introduces deviation compensation to maintain the class decision boundaries of old classes.
• In weight space, the proposed weight calibration leverages forgetting-aware weight perturbation to increase transferability and reduce forgetting.
• Extensive experiments demonstrate that ItO can significantly and consistently improve the performance of existing state-of-the-art methods. [ABSTRACT FROM AUTHOR]
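The abstract describes forgetting-aware weight perturbation only at a high level. As a rough illustration of the general idea (perturb the weights in a direction where the old-task loss grows, then update against the loss evaluated at the perturbed point, in the style of sharpness-aware minimization), here is a minimal sketch on a toy quadratic problem. The function names, the toy losses, and the choice of perturbation direction are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hedged sketch of a weight-perturbation update loosely in the spirit of
# "forgetting-aware weight perturbation". The quadratic losses and the
# use of the old-task gradient as the perturbation direction are
# assumptions for illustration only.

def loss(w, target):
    """Toy quadratic loss: squared distance of weights w from a target optimum."""
    return 0.5 * np.sum((w - target) ** 2)

def grad(w, target):
    """Gradient of the toy quadratic loss."""
    return w - target

def perturbed_step(w, old_target, new_target, rho=0.05, lr=0.1):
    """One update: first perturb w in the direction that increases the
    old-task loss (where forgetting would grow fastest), then take a
    descent step on the new-task loss at the perturbed weights."""
    g_old = grad(w, old_target)
    w_adv = w + rho * g_old / (np.linalg.norm(g_old) + 1e-12)
    return w - lr * grad(w_adv, new_target)

w = np.zeros(3)
old_t = np.array([1.0, 0.0, 0.0])   # stand-in for the old-task optimum
new_t = np.array([0.0, 1.0, 0.0])   # stand-in for the new-task optimum
for _ in range(200):
    w = perturbed_step(w, old_t, new_t)
```

On this toy problem the iterates settle near the new-task optimum, offset by roughly the perturbation radius `rho`; the point of the sketch is only the two-step structure (adversarial perturbation, then descent at the perturbed weights), not any quantitative claim about the paper's results.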

Subjects

Subjects :
*MACHINE learning
*CALIBRATION

Details

Language :
English
ISSN :
0893-6080
Volume :
164
Database :
Academic Search Index
Journal :
Neural Networks
Publication Type :
Academic Journal
Accession number :
164259797
Full Text :
https://doi.org/10.1016/j.neunet.2023.04.010