Author: "Rannen-Triki, Amal" / Language: undetermined - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Rannen-Triki, Amal"' showing total 3 results

Start Over Author "Rannen-Triki, Amal" Language undetermined

3 results on '"Rannen-Triki, Amal"'

1. Towards Compute-Optimal Transfer Learning

Author: Caccia, Massimo, Galashov, Alexandre, Douillard, Arthur, Rannen-Triki, Amal, Rao, Dushyant, Paganini, Michela, Charlin, Laurent, Ranzato, Marc'Aurelio, and Pascanu, Razvan
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Machine Learning (cs.LG)
Abstract: The field of transfer learning is undergoing a significant shift with the introduction of large pretrained models which have demonstrated strong adaptability to a variety of downstream tasks. However, the high computational and memory requirements to finetune or use these models can be a hindrance to their widespread use. In this study, we present a solution to this issue by proposing a simple yet effective way to trade computational efficiency for asymptotic performance which we define as the performance a learning algorithm achieves as compute tends to infinity. Specifically, we argue that zero-shot structured pruning of pretrained models allows them to increase compute efficiency with minimal reduction in performance. We evaluate our method on the Nevis'22 continual learning benchmark that offers a diverse set of transfer scenarios. Our results show that pruning convolutional filters of pretrained models can lead to more than 20% performance improvement in low computational regimes.
Published: 2023
Full Text: View/download PDF

2. Kalman Filter for Online Classification of Non-Stationary Data

Author: Titsias, Michalis K., Galashov, Alexandre, Rannen-Triki, Amal, Pascanu, Razvan, Teh, Yee Whye, and Bornschein, Jorg
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Machine Learning (cs.LG)
Abstract: In Online Continual Learning (OCL) a learning system receives a stream of data and sequentially performs prediction and training steps. Important challenges in OCL are concerned with automatic adaptation to the particular non-stationary structure of the data, and with quantification of predictive uncertainty. Motivated by these challenges we introduce a probabilistic Bayesian online learning model by using a (possibly pretrained) neural representation and a state space model over the linear predictor weights. Non-stationarity over the linear predictor weights is modelled using a parameter drift transition density, parametrized by a coefficient that quantifies forgetting. Inference in the model is implemented with efficient Kalman filter recursions which track the posterior distribution over the linear weights, while online SGD updates over the transition dynamics coefficient allows to adapt to the non-stationarity seen in data. While the framework is developed assuming a linear Gaussian model, we also extend it to deal with classification problems and for fine-tuning the deep learning representation. In a set of experiments in multi-class classification using data sets such as CIFAR-100 and CLOC we demonstrate the predictive ability of the model and its flexibility to capture non-stationarity.
Published: 2023
Full Text: View/download PDF

3. Towards Robust and Efficient Continual Language Learning

Author: Fisch, Adam, Rannen-Triki, Amal, Pascanu, Razvan, Bornschein, Jörg, Lazaridou, Angeliki, Gribovskaya, Elena, and Ranzato, Marc'Aurelio
Subjects: FOS: Computer and information sciences, Computer Science - Computation and Language, Computation and Language (cs.CL)
Abstract: As the application space of language models continues to evolve, a natural question to ask is how we can quickly adapt models to new tasks. We approach this classic question from a continual learning perspective, in which we aim to continue fine-tuning models trained on past tasks on new tasks, with the goal of "transferring" relevant knowledge. However, this strategy also runs the risk of doing more harm than good, i.e., negative transfer. In this paper, we construct a new benchmark of task sequences that target different possible transfer scenarios one might face, such as a sequence of tasks with high potential of positive transfer, high potential for negative transfer, no expected effect, or a mixture of each. An ideal learner should be able to maximally exploit information from all tasks that have any potential for positive transfer, while also avoiding the negative effects of any distracting tasks that may confuse it. We then propose a simple, yet effective, learner that satisfies many of our desiderata simply by leveraging a selective strategy for initializing new models from past task checkpoints. Still, limitations remain, and we hope this benchmark can help the community to further build and analyze such learners.
Published: 2023
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

3 results on '"Rannen-Triki, Amal"'

1. Towards Compute-Optimal Transfer Learning

2. Kalman Filter for Online Classification of Non-Stationary Data

3. Towards Robust and Efficient Continual Language Learning

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Database

3 results on '"Rannen-Triki, Amal"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources