
Towards Compute-Optimal Transfer Learning

Authors:
Caccia, Massimo
Galashov, Alexandre
Douillard, Arthur
Rannen-Triki, Amal
Rao, Dushyant
Paganini, Michela
Charlin, Laurent
Ranzato, Marc'Aurelio
Pascanu, Razvan
Publication Year: 2023

Abstract

The field of transfer learning is undergoing a significant shift with the introduction of large pretrained models, which have demonstrated strong adaptability to a variety of downstream tasks. However, the high computational and memory requirements of finetuning or even using these models can be a hindrance to their widespread adoption. In this study, we address this issue by proposing a simple yet effective way to trade computational efficiency for asymptotic performance, which we define as the performance a learning algorithm achieves as compute tends to infinity. Specifically, we argue that zero-shot structured pruning of pretrained models allows them to increase compute efficiency with minimal reduction in performance. We evaluate our method on the Nevis'22 continual learning benchmark, which offers a diverse set of transfer scenarios. Our results show that pruning convolutional filters of pretrained models can lead to a performance improvement of more than 20% in low-compute regimes.
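For readers unfamiliar with the operation the abstract refers to, the sketch below is a minimal illustration of zero-shot structured pruning of convolutional filters in PyTorch. The ranking criterion (per-filter L1 norm), the backbone (ResNet-18), and the 50% pruning ratio are assumptions chosen for illustration; they are not the specific procedure or settings evaluated in the paper.

# Minimal sketch: zero-shot structured pruning of conv filters by L1 norm.
# Illustrative assumptions: ResNet-18 backbone, 50% pruning ratio, L1 criterion.
import torch.nn as nn
import torch.nn.utils.prune as prune
from torchvision.models import resnet18

model = resnet18(weights="IMAGENET1K_V1")  # any pretrained backbone would do
prune_ratio = 0.5  # fraction of filters to prune per layer (assumed value)

for module in model.modules():
    if isinstance(module, nn.Conv2d):
        # Zero out whole output filters (dim=0) with the smallest L1 norm,
        # without looking at any downstream-task data ("zero-shot").
        prune.ln_structured(module, name="weight", amount=prune_ratio, n=1, dim=0)
        prune.remove(module, "weight")  # make the pruning permanent

Note that this utility only zeroes the selected filters in place; realizing the corresponding compute and memory savings additionally requires physically removing the zeroed channels and shrinking the downstream layers, which the sketch does not do.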

Details

Database: arXiv
Publication Type: Report
Accession number: edsarx.2304.13164
Document Type: Working Paper