Author: "Kahou, Samira Ebrahimi" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Kahou, Samira Ebrahimi"' showing total 118 results

Start Over Author "Kahou, Samira Ebrahimi"

118 results on '"Kahou, Samira Ebrahimi"'

1. Improving Quality Control Of MRI Images Using Synthetic Motion Data

Author: Bricout, Charles, Cho, Kang Ik K., Harms, Michael, Pasternak, Ofer, Bearden, Carrie E., McGorry, Patrick D., Kahn, Rene S., Kane, John, Nelson, Barnaby, Woods, Scott W., Shenton, Martha E., Bouix, Sylvain, and Kahou, Samira Ebrahimi
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: MRI quality control (QC) is challenging due to unbalanced and limited datasets, as well as subjective scoring, which hinder the development of reliable automated QC systems. To address these issues, we introduce an approach that pretrains a model on synthetically generated motion artifacts before applying transfer learning for QC classification. This method not only improves the accuracy in identifying poor-quality scans but also reduces training time and resource requirements compared to training from scratch. By leveraging synthetic data, we provide a more robust and resource-efficient solution for QC automation in MRI, paving the way for broader adoption in diverse research settings., Comment: Accepted at ISBI 2025
Published: 2025

2. Comparative Analysis of Diffusion Generative Models in Computational Pathology

Author: Thakkar, Denisha, Trinh, Vincent Quoc-Huy, Varma, Sonal, Kahou, Samira Ebrahimi, Rivaz, Hassan, and Hosseini, Mahdi S.
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Diffusion Generative Models (DGM) have rapidly surfaced as emerging topics in the field of computer vision, garnering significant interest across a wide array of deep learning applications. Despite their high computational demand, these models are extensively utilized for their superior sample quality and robust mode coverage. While research in diffusion generative models is advancing, exploration within the domain of computational pathology and its large-scale datasets has been comparatively gradual. Bridging the gap between the high-quality generation capabilities of Diffusion Generative Models and the intricate nature of pathology data, this paper presents an in-depth comparative analysis of diffusion methods applied to a pathology dataset. Our analysis extends to datasets with varying Fields of View (FOV), revealing that DGMs are highly effective in producing high-quality synthetic data. An ablative study is also conducted, followed by a detailed discussion on the impact of various methods on the synthesized histopathology images. One striking observation from our experiments is how the adjustment of image size during data generation can simulate varying fields of view. These findings underscore the potential of DGMs to enhance the quality and diversity of synthetic pathology data, especially when used with real data, ultimately increasing accuracy of deep learning models in histopathology. Code is available from https://github.com/AtlasAnalyticsLab/Diffusion4Path, Comment: Submitted paper under review
Published: 2024

3. Adaptive Group Robust Ensemble Knowledge Distillation

Author: Kenfack, Patrik, Aïvodji, Ulrich, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Machine Learning
Abstract: Neural networks can learn spurious correlations in the data, often leading to performance disparity for underrepresented subgroups. Studies have demonstrated that the disparity is amplified when knowledge is distilled from a complex teacher model to a relatively "simple" student model. Prior work has shown that ensemble deep learning methods can improve the performance of the worst-case subgroups; however, it is unclear if this advantage carries over when distilling knowledge from an ensemble of teachers, especially when the teacher models are debiased. This study demonstrates that traditional ensemble knowledge distillation can significantly drop the performance of the worst-case subgroups in the distilled student model even when the teacher models are debiased. To overcome this, we propose Adaptive Group Robust Ensemble Knowledge Distillation (AGRE-KD), a simple ensembling strategy to ensure that the student model receives knowledge beneficial for unknown underrepresented subgroups. Leveraging an additional biased model, our method selectively chooses teachers whose knowledge would better improve the worst-performing subgroups by upweighting the teachers with gradient directions deviating from the biased model. Our experiments on several datasets demonstrate the superiority of the proposed ensemble distillation technique and show that it can even outperform classic model ensembles based on majority voting., Comment: Workshop Algorithmic Fairness through the Lens of Metrics and Evaluation at NeurIPS 2024
Published: 2024

4. KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation

Author: Azimi, Rambod, Rishav, Rishav, Teichmann, Marek, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Large language models (LLMs) have demonstrated remarkable performance across various downstream tasks. However, the high computational and memory requirements of LLMs are a major bottleneck. To address this, parameter-efficient fine-tuning (PEFT) methods such as low-rank adaptation (LoRA) have been proposed to reduce computational costs while ensuring minimal loss in performance. Additionally, knowledge distillation (KD) has been a popular choice for obtaining compact student models from teacher models. In this work, we present KD-LoRA, a novel fine-tuning method that combines LoRA with KD. Our results demonstrate that KD-LoRA achieves performance comparable to full fine-tuning (FFT) and LoRA while significantly reducing resource requirements. Specifically, KD-LoRA retains 98% of LoRA's performance on the GLUE benchmark, while being 40% more compact. Additionally, KD-LoRA reduces GPU memory usage by 30% compared to LoRA, while decreasing inference time by 30% compared to both FFT and LoRA. We evaluate KD-LoRA across three encoder-only models: BERT, RoBERTa, and DeBERTaV3. Code is available at https://github.com/rambodazimi/KD-LoRA., Comment: Accepted at 4th NeurIPS Efficient Natural Language and Speech Processing Workshop (ENLSP-IV 2024)
Published: 2024

5. Prediction of Final Phosphorus Content of Steel in a Scrap-Based Electric Arc Furnace Using Artificial Neural Networks

Author: Azzaz, Riadh, Hurel, Valentin, Menard, Patrice, Jahazi, Mohammad, Kahou, Samira Ebrahimi, and Moosavi-Khoonsari, Elmira
Subjects: Computer Science - Machine Learning, Condensed Matter - Materials Science
Abstract: The scrap-based electric arc furnace process is expected to capture a significant share of the steel market in the future due to its potential for reducing environmental impacts through steel recycling. However, managing impurities, particularly phosphorus, remains a challenge. This study aims to develop a machine learning model to estimate the steel phosphorus content at the end of the process based on input parameters. Data were collected over two years from a steel plant, focusing on the chemical composition and weight of the scrap, the volume of oxygen injected, and process duration. After preprocessing the data, several machine learning models were evaluated, with the artificial neural network (ANN) emerging as the most effective. The best ANN model included four hidden layers. The model was trained for 500 epochs with a batch size of 50. The best model achieves a mean square error (MSE) of 0.000016, a root-mean-square error (RMSE) of 0.0049998, a coefficient of determination (R2) of 99.96%, and a correlation coefficient (r) of 99.98%. Notably, the model achieved a 100% hit rate for predicting phosphorus content within +-0.001 wt% (+-10 ppm). These results demonstrate that the optimized ANN model offers accurate predictions for the steel final phosphorus content., Comment: 53 pages, 8 figures
Published: 2024

6. Learning Multi-agent Multi-machine Tending by Mobile Robots

Author: Abdalwhab, Abdalwhab, Beltrame, Giovanni, Kahou, Samira Ebrahimi, and St-Onge, David
Subjects: Computer Science - Robotics, Computer Science - Machine Learning
Abstract: Robotics can help address the growing worker shortage challenge of the manufacturing industry. As such, machine tending is a task collaborative robots can tackle that can also highly boost productivity. Nevertheless, existing robotics systems deployed in that sector rely on a fixed single-arm setup, whereas mobile robots can provide more flexibility and scalability. In this work, we introduce a multi-agent multi-machine tending learning framework by mobile robots based on Multi-agent Reinforcement Learning (MARL) techniques with the design of a suitable observation and reward. Moreover, an attention-based encoding mechanism is developed and integrated into Multi-agent Proximal Policy Optimization (MAPPO) algorithm to boost its performance for machine tending scenarios. Our model (AB-MAPPO) outperformed MAPPO in this new challenging scenario in terms of task success, safety, and resources utilization. Furthermore, we provided an extensive ablation study to support our various design decisions., Comment: 8 pages, 4 figures, Accepted at an AAAI workshop (The Multi-Agent AI in the Real World Workshop)
Published: 2024

7. Empowering Clinicians with Medical Decision Transformers: A Framework for Sepsis Treatment

Author: Rahman, Aamer Abdul, Agarwal, Pranav, Noumeir, Rita, Jouvet, Philippe, Michalski, Vincent, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Offline reinforcement learning has shown promise for solving tasks in safety-critical settings, such as clinical decision support. Its application, however, has been limited by the lack of interpretability and interactivity for clinicians. To address these challenges, we propose the medical decision transformer (MeDT), a novel and versatile framework based on the goal-conditioned reinforcement learning paradigm for sepsis treatment recommendation. MeDT uses the decision transformer architecture to learn a policy for drug dosage recommendation. During offline training, MeDT utilizes collected treatment trajectories to predict administered treatments for each time step, incorporating known treatment outcomes, target acuity scores, past treatment decisions, and current and past medical states. This analysis enables MeDT to capture complex dependencies among a patient's medical history, treatment decisions, outcomes, and short-term effects on stability. Our proposed conditioning uses acuity scores to address sparse reward issues and to facilitate clinician-model interactions, enhancing decision-making. Following training, MeDT can generate tailored treatment recommendations by conditioning on the desired positive outcome (survival) and user-specified short-term stability improvements. We carry out rigorous experiments on data from the MIMIC-III dataset and use off-policy evaluation to demonstrate that MeDT recommends interventions that outperform or are competitive with existing offline reinforcement learning methods while enabling a more interpretable, personalized and clinician-directed approach.
Published: 2024

8. Reinforcement Learning for Sequence Design Leveraging Protein Language Models

Author: Subramanian, Jithendaraa, Sujit, Shivakanth, Irtisam, Niloy, Sain, Umong, Islam, Riashat, Nowrouzezahrai, Derek, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Quantitative Biology - Biomolecules
Abstract: Protein sequence design, determined by amino acid sequences, are essential to protein engineering problems in drug discovery. Prior approaches have resorted to evolutionary strategies or Monte-Carlo methods for protein design, but often fail to exploit the structure of the combinatorial search space, to generalize to unseen sequences. In the context of discrete black box optimization over large search spaces, learning a mutation policy to generate novel sequences with reinforcement learning is appealing. Recent advances in protein language models (PLMs) trained on large corpora of protein sequences offer a potential solution to this problem by scoring proteins according to their biological plausibility (such as the TM-score). In this work, we propose to use PLMs as a reward function to generate new sequences. Yet the PLM can be computationally expensive to query due to its large size. To this end, we propose an alternative paradigm where optimization can be performed on scores from a smaller proxy model that is periodically finetuned, jointly while learning the mutation policy. We perform extensive experiments on various sequence lengths to benchmark RL-based approaches, and provide comprehensive evaluations along biological plausibility and diversity of the protein. Our experimental results include favorable evaluations of the proposed sequences, along with high diversity scores, demonstrating that RL is a strong candidate for biological sequence design. Finally, we provide a modular open source implementation can be easily integrated in most RL training loops, with support for replacing the reward model with other PLMs, to spur further research in this domain. The code for all experiments is provided in the supplementary material., Comment: 22 pages, 7 figures, 4 tables
Published: 2024

9. Learning to Play Atari in a World of Tokens

Author: Agarwal, Pranav, Andrews, Sheldon, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Machine Learning
Abstract: Model-based reinforcement learning agents utilizing transformers have shown improved sample efficiency due to their ability to model extended context, resulting in more accurate world models. However, for complex reasoning and planning tasks, these methods primarily rely on continuous representations. This complicates modeling of discrete properties of the real world such as disjoint object classes between which interpolation is not plausible. In this work, we introduce discrete abstract representations for transformer-based learning (DART), a sample-efficient method utilizing discrete representations for modeling both the world and learning behavior. We incorporate a transformer-decoder for auto-regressive world modeling and a transformer-encoder for learning behavior by attending to task-relevant cues in the discrete representation of the world model. For handling partial observability, we aggregate information from past time steps as memory tokens. DART outperforms previous state-of-the-art methods that do not use look-ahead search on the Atari 100k sample efficiency benchmark with a median human-normalized score of 0.790 and beats humans in 9 out of 26 games. We release our code at https://pranaval.github.io/DART/., Comment: Accepted at ICML 2024
Published: 2024

10. On the Limits of Multi-modal Meta-Learning with Auxiliary Task Modulation Using Conditional Batch Normalization

Author: Armengol-Estapé, Jordi, Michalski, Vincent, Kumar, Ramnath, St-Charles, Pierre-Luc, Precup, Doina, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Few-shot learning aims to learn representations that can tackle novel tasks given a small number of examples. Recent studies show that cross-modal learning can improve representations for few-shot classification. More specifically, language is a rich modality that can be used to guide visual learning. In this work, we experiment with a multi-modal architecture for few-shot learning that consists of three components: a classifier, an auxiliary network, and a bridge network. While the classifier performs the main classification task, the auxiliary network learns to predict language representations from the same input, and the bridge network transforms high-level features of the auxiliary network into modulation parameters for layers of the few-shot classifier using conditional batch normalization. The bridge should encourage a form of lightweight semantic alignment between language and vision which could be useful for the classifier. However, after evaluating the proposed approach on two popular few-shot classification benchmarks we find that a) the improvements do not reproduce across benchmarks, and b) when they do, the improvements are due to the additional compute and parameters introduced by the bridge network. We contribute insights and recommendations for future work in multi-modal meta-learning, especially when using language representations.
Published: 2024

11. Predicting End-Point Phosphorus Content in Electric Arc Furnace Steel with Artificial Neural Networks

Author: Azzaz, Riadh, Gallego, Paloma Isabel, Jahazi, Mohammad, Kahou, Samira Ebrahimi, Moosavi-Khoonsari, Elmira, and Metallurgy and Materials Society of CIM, editor
Published: 2025
Full Text: View/download PDF

12. Spectral Temporal Contrastive Learning

Author: Morin, Sacha, Nath, Somjit, Kahou, Samira Ebrahimi, and Wolf, Guy
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Learning useful data representations without requiring labels is a cornerstone of modern deep learning. Self-supervised learning methods, particularly contrastive learning (CL), have proven successful by leveraging data augmentations to define positive pairs. This success has prompted a number of theoretical studies to better understand CL and investigate theoretical bounds for downstream linear probing tasks. This work is concerned with the temporal contrastive learning (TCL) setting where the sequential structure of the data is used instead to define positive pairs, which is more commonly used in RL and robotics contexts. In this paper, we adapt recent work on Spectral CL to formulate Spectral Temporal Contrastive Learning (STCL). We discuss a population loss based on a state graph derived from a time-homogeneous reversible Markov chain with uniform stationary distribution. The STCL loss enables to connect the linear probing performance to the spectral properties of the graph, and can be estimated by considering previously observed data sequences as an ensemble of MCMC chains., Comment: Accepted to Self-Supervised Learning - Theory and Practice, NeurIPS Workshop, 2023
Published: 2023

13. Auxiliary Losses for Learning Generalizable Concept-based Models

Author: Sheth, Ivaxi and Kahou, Samira Ebrahimi
Subjects: Computer Science - Machine Learning
Abstract: The increasing use of neural networks in various applications has lead to increasing apprehensions, underscoring the necessity to understand their operations beyond mere final predictions. As a solution to enhance model transparency, Concept Bottleneck Models (CBMs) have gained popularity since their introduction. CBMs essentially limit the latent space of a model to human-understandable high-level concepts. While beneficial, CBMs have been reported to often learn irrelevant concept representations that consecutively damage model performance. To overcome the performance trade-off, we propose cooperative-Concept Bottleneck Model (coop-CBM). The concept representation of our model is particularly meaningful when fine-grained concept labels are absent. Furthermore, we introduce the concept orthogonal loss (COL) to encourage the separation between the concept representations and to reduce the intra-concept distance. This paper presents extensive experiments on real-world datasets for image classification tasks, namely CUB, AwA2, CelebA and TIL. We also study the performance of coop-CBM models under various distributional shift settings. We show that our proposed method achieves higher accuracy in all distributional shift settings even compared to the black-box models with the highest concept accuracy., Comment: Neurips 2023
Published: 2023

14. Transparent Anomaly Detection via Concept-based Explanations

Author: Sevyeri, Laya Rafiee, Sheth, Ivaxi, Farahnak, Farhood, Kahou, Samira Ebrahimi, and Enger, Shirin Abbasinejad
Subjects: Computer Science - Machine Learning
Abstract: Advancements in deep learning techniques have given a boost to the performance of anomaly detection. However, real-world and safety-critical applications demand a level of transparency and reasoning beyond accuracy. The task of anomaly detection (AD) focuses on finding whether a given sample follows the learned distribution. Existing methods lack the ability to reason with clear explanations for their outcomes. Hence to overcome this challenge, we propose Transparent {A}nomaly Detection {C}oncept {E}xplanations (ACE). ACE is able to provide human interpretable explanations in the form of concepts along with anomaly prediction. To the best of our knowledge, this is the first paper that proposes interpretable by-design anomaly detection. In addition to promoting transparency in AD, it allows for effective human-model interaction. Our proposed model shows either higher or comparable results to black-box uninterpretable models. We validate the performance of ACE across three realistic datasets - bird classification on CUB-200-2011, challenging histopathology slide image classification on TIL-WSI-TCGA, and gender classification on CelebA. We further demonstrate that our concept learning paradigm can be seamlessly integrated with other classification-based AD methods., Comment: Accepted at Neurips XAI in Action workshop
Published: 2023

15. Fairness Under Demographic Scarce Regime

Author: Kenfack, Patrik Joslin, Kahou, Samira Ebrahimi, and Aïvodji, Ulrich
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Most existing works on fairness assume the model has full access to demographic information. However, there exist scenarios where demographic information is partially available because a record was not maintained throughout data collection or for privacy reasons. This setting is known as demographic scarce regime. Prior research has shown that training an attribute classifier to replace the missing sensitive attributes (proxy) can still improve fairness. However, using proxy-sensitive attributes worsens fairness-accuracy tradeoffs compared to true sensitive attributes. To address this limitation, we propose a framework to build attribute classifiers that achieve better fairness-accuracy tradeoffs. Our method introduces uncertainty awareness in the attribute classifier and enforces fairness on samples with demographic information inferred with the lowest uncertainty. We show empirically that enforcing fairness constraints on samples with uncertain sensitive attributes can negatively impact the fairness-accuracy tradeoff. Our experiments on five datasets showed that the proposed framework yields models with significantly better fairness-accuracy tradeoffs than classic attribute classifiers. Surprisingly, our framework can outperform models trained with fairness constraints on the true sensitive attributes in most benchmarks. We also show that these findings are consistent with other uncertainty measures such as conformal prediction., Comment: Published in Transactions on Machine Learning Research (TMLR), 2024
Published: 2023

16. Transformers in Reinforcement Learning: A Survey

Author: Agarwal, Pranav, Rahman, Aamer Abdul, St-Charles, Pierre-Luc, Prince, Simon J. D., and Kahou, Samira Ebrahimi
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: Transformers have significantly impacted domains like natural language processing, computer vision, and robotics, where they improve performance compared to other neural networks. This survey explores how transformers are used in reinforcement learning (RL), where they are seen as a promising solution for addressing challenges such as unstable training, credit assignment, lack of interpretability, and partial observability. We begin by providing a brief domain overview of RL, followed by a discussion on the challenges of classical RL algorithms. Next, we delve into the properties of the transformer and its variants and discuss the characteristics that make them well-suited to address the challenges inherent in RL. We examine the application of transformers to various aspects of RL, including representation learning, transition and reward function modeling, and policy optimization. We also discuss recent research that aims to enhance the interpretability and efficiency of transformers in RL, using visualization techniques and efficient training strategies. Often, the transformer architecture must be tailored to the specific needs of a given application. We present a broad overview of how transformers have been adapted for several applications, including robotics, medicine, language modeling, cloud computing, and combinatorial optimization. We conclude by discussing the limitations of using transformers in RL and assess their potential for catalyzing future breakthroughs in this field., Comment: 35 pages, 11 figures
Published: 2023

17. CAMMARL: Conformal Action Modeling in Multi Agent Reinforcement Learning

Author: Gupta, Nikunj, Nath, Somjit, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Machine Learning, Computer Science - Multiagent Systems
Abstract: Before taking actions in an environment with more than one intelligent agent, an autonomous agent may benefit from reasoning about the other agents and utilizing a notion of a guarantee or confidence about the behavior of the system. In this article, we propose a novel multi-agent reinforcement learning (MARL) algorithm CAMMARL, which involves modeling the actions of other agents in different situations in the form of confident sets, i.e., sets containing their true actions with a high probability. We then use these estimates to inform an agent's decision-making. For estimating such sets, we use the concept of conformal predictions, by means of which, we not only obtain an estimate of the most probable outcome but get to quantify the operable uncertainty as well. For instance, we can predict a set that provably covers the true predictions with high probabilities (e.g., 95%). Through several experiments in two fully cooperative multi-agent tasks, we show that CAMMARL elevates the capabilities of an autonomous agent in MARL by modeling conformal prediction sets over the behavior of other agents in the environment and utilizing such estimates to enhance its policy learning.
Published: 2023

18. Discovering Object-Centric Generalized Value Functions From Pixels

Author: Nath, Somjit, Subbaraj, Gopeshh Raaj, Khetarpal, Khimya, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Deep Reinforcement Learning has shown significant progress in extracting useful representations from high-dimensional inputs albeit using hand-crafted auxiliary tasks and pseudo rewards. Automatically learning such representations in an object-centric manner geared towards control and fast adaptation remains an open research problem. In this paper, we introduce a method that tries to discover meaningful features from objects, translating them to temporally coherent "question" functions and leveraging the subsequent learned general value functions for control. We compare our approach with state-of-the-art techniques alongside other ablations and show competitive performance in both stationary and non-stationary settings. Finally, we also investigate the discovered general value functions and through qualitative analysis show that the learned representations are not only interpretable but also, centered around objects that are invariant to changes across tasks facilitating fast adaptation., Comment: Accepted at ICML 2023
Published: 2023

19. Source-free Domain Adaptation Requires Penalized Diversity

Author: Sevyeri, Laya Rafiee, Sheth, Ivaxi, Farahnak, Farhood, See, Alexandre, Kahou, Samira Ebrahimi, Fevens, Thomas, and Havaei, Mohammad
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: While neural networks are capable of achieving human-like performance in many tasks such as image classification, the impressive performance of each model is limited to its own dataset. Source-free domain adaptation (SFDA) was introduced to address knowledge transfer between different domains in the absence of source data, thus, increasing data privacy. Diversity in representation space can be vital to a model`s adaptability in varied and difficult domains. In unsupervised SFDA, the diversity is limited to learning a single hypothesis on the source or learning multiple hypotheses with a shared feature extractor. Motivated by the improved predictive performance of ensembles, we propose a novel unsupervised SFDA algorithm that promotes representational diversity through the use of separate feature extractors with Distinct Backbone Architectures (DBA). Although diversity in feature space is increased, the unconstrained mutual information (MI) maximization may potentially introduce amplification of weak hypotheses. Thus we introduce the Weak Hypothesis Penalization (WHP) regularizer as a mitigation strategy. Our work proposes Penalized Diversity (PD) where the synergy of DBA and WHP is applied to unsupervised source-free domain adaptation for covariate shift. In addition, PD is augmented with a weighted MI maximization objective for label distribution shift. Empirical results on natural, synthetic, and medical domains demonstrate the effectiveness of PD under different distributional shifts.
Published: 2023

20. Bridging the Gap Between Offline and Online Reinforcement Learning Evaluation Methodologies

Author: Sujit, Shivakanth, Braga, Pedro H. M., Bornschein, Jorg, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Machine Learning
Abstract: Reinforcement learning (RL) has shown great promise with algorithms learning in environments with large state and action spaces purely from scalar reward signals. A crucial challenge for current deep RL algorithms is that they require a tremendous amount of environment interactions for learning. This can be infeasible in situations where such interactions are expensive; such as in robotics. Offline RL algorithms try to address this issue by bootstrapping the learning process from existing logged data without needing to interact with the environment from the very beginning. While online RL algorithms are typically evaluated as a function of the number of environment interactions, there exists no single established protocol for evaluating offline RL methods.In this paper, we propose a sequential approach to evaluate offline RL algorithms as a function of the training set size and thus by their data efficiency. Sequential evaluation provides valuable insights into the data efficiency of the learning process and the robustness of algorithms to distribution changes in the dataset while also harmonizing the visualization of the offline and online learning phases. Our approach is generally applicable and easy to implement. We compare several existing offline RL algorithms using this approach and present insights from a variety of tasks and offline datasets., Comment: TMLR 2023
Published: 2022

21. Pitfalls of Conditional Batch Normalization for Contextual Multi-Modal Learning

Author: Sheth, Ivaxi, Rahman, Aamer Abdul, Havaei, Mohammad, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Humans have perfected the art of learning from multiple modalities through sensory organs. Despite their impressive predictive performance on a single modality, neural networks cannot reach human level accuracy with respect to multiple modalities. This is a particularly challenging task due to variations in the structure of respective modalities. Conditional Batch Normalization (CBN) is a popular method that was proposed to learn contextual features to aid deep learning tasks. This technique uses auxiliary data to improve representational power by learning affine transformations for convolutional neural networks. Despite the boost in performance observed by using CBN layers, our work reveals that the visual features learned by introducing auxiliary data via CBN deteriorates. We perform comprehensive experiments to evaluate the brittleness of CBN networks to various datasets, suggesting that learning from visual features alone could often be superior for generalization. We evaluate CBN models on natural images for bird classification and histology images for cancer type classification. We observe that the CBN network learns close to no visual features on the bird classification dataset and partial visual features on the histology dataset. Our extensive experiments reveal that CBN may encourage shortcut learning between the auxiliary data and labels., Comment: Accepted at ICBINB workshop @ NeurIPS 2022
Published: 2022

22. Automatic Evaluation of Excavator Operators using Learned Reward Functions

Author: Agarwal, Pranav, Teichmann, Marek, Andrews, Sheldon, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Robotics, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Training novice users to operate an excavator for learning different skills requires the presence of expert teachers. Considering the complexity of the problem, it is comparatively expensive to find skilled experts as the process is time-consuming and requires precise focus. Moreover, since humans tend to be biased, the evaluation process is noisy and will lead to high variance in the final score of different operators with similar skills. In this work, we address these issues and propose a novel strategy for the automatic evaluation of excavator operators. We take into account the internal dynamics of the excavator and the safety criterion at every time step to evaluate the performance. To further validate our approach, we use this score prediction model as a source of reward for a reinforcement learning agent to learn the task of maneuvering an excavator in a simulated environment that closely replicates the real-world dynamics. For a policy learned using these external reward prediction models, our results demonstrate safer solutions following the required dynamic constraints when compared to policy trained with task-based reward functions only, making it one step closer to real-life adoption. For future research, we release our codebase at https://github.com/pranavAL/InvRL_Auto-Evaluate and video results https://drive.google.com/file/d/1jR1otOAu8zrY8mkhUOUZW9jkBOAKK71Z/view?usp=share_link ., Comment: 11 pages, 5 figures, Accepted at Reinforcement Learning for Real Life (RL4RealLife) Workshop at NeurIPS 2022
Published: 2022

23. BERT on a Data Diet: Finding Important Examples by Gradient-Based Pruning

Author: Fayyaz, Mohsen, Aghazadeh, Ehsan, Modarressi, Ali, Pilehvar, Mohammad Taher, Yaghoobzadeh, Yadollah, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Computation and Language
Abstract: Current pre-trained language models rely on large datasets for achieving state-of-the-art performance. However, past research has shown that not all examples in a dataset are equally important during training. In fact, it is sometimes possible to prune a considerable fraction of the training set while maintaining the test performance. Established on standard vision benchmarks, two gradient-based scoring metrics for finding important examples are GraNd and its estimated version, EL2N. In this work, we employ these two metrics for the first time in NLP. We demonstrate that these metrics need to be computed after at least one epoch of fine-tuning and they are not reliable in early steps. Furthermore, we show that by pruning a small portion of the examples with the highest GraNd/EL2N scores, we can not only preserve the test accuracy, but also surpass it. This paper details adjustments and implementation choices which enable GraNd and EL2N to be applied to NLP., Comment: ENLSP @ NeurIPS2022
Published: 2022

24. Learning Latent Structural Causal Models

Author: Subramanian, Jithendaraa, Annadani, Yashas, Sheth, Ivaxi, Ke, Nan Rosemary, Deleu, Tristan, Bauer, Stefan, Nowrouzezahrai, Derek, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Methodology
Abstract: Causal learning has long concerned itself with the accurate recovery of underlying causal mechanisms. Such causal modelling enables better explanations of out-of-distribution data. Prior works on causal learning assume that the high-level causal variables are given. However, in machine learning tasks, one often operates on low-level data like image pixels or high-dimensional vectors. In such settings, the entire Structural Causal Model (SCM) -- structure, parameters, \textit{and} high-level causal variables -- is unobserved and needs to be learnt from low-level data. We treat this problem as Bayesian inference of the latent SCM, given low-level data. For linear Gaussian additive noise SCMs, we present a tractable approximate inference method which performs joint inference over the causal variables, structure and parameters of the latent SCM from random, known interventions. Experiments are performed on synthetic datasets and a causally generated image dataset to demonstrate the efficacy of our approach. We also perform image generation from unseen interventions, thereby verifying out of distribution generalization for the proposed causal model., Comment: 21 pages, 19 figures
Published: 2022

25. Locally Constrained Representations in Reinforcement Learning

Author: Nath, Somjit, Arora, Rushiv, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: The success of Reinforcement Learning (RL) heavily relies on the ability to learn robust representations from the observations of the environment. In most cases, the representations learned purely by the reinforcement learning loss can differ vastly across states depending on how the value functions change. However, the representations learned need not be very specific to the task at hand. Relying only on the RL objective may yield representations that vary greatly across successive time steps. In addition, since the RL loss has a changing target, the representations learned would depend on how good the current values/policies are. Thus, disentangling the representations from the main task would allow them to focus not only on the task-specific features but also the environment dynamics. To this end, we propose locally constrained representations, where an auxiliary loss forces the state representations to be predictable by the representations of the neighboring states. This encourages the representations to be driven not only by the value/policy learning but also by an additional loss that constrains the representations from over-fitting to the value loss. We evaluate the proposed method on several known benchmarks and observe strong performance. Especially in continuous control tasks, our experiments show a significant performance improvement.
Published: 2022

26. Prioritizing Samples in Reinforcement Learning with Reducible Loss

Author: Sujit, Shivakanth, Nath, Somjit, Braga, Pedro H. M., and Kahou, Samira Ebrahimi
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Most reinforcement learning algorithms take advantage of an experience replay buffer to repeatedly train on samples the agent has observed in the past. Not all samples carry the same amount of significance and simply assigning equal importance to each of the samples is a na\"ive strategy. In this paper, we propose a method to prioritize samples based on how much we can learn from a sample. We define the learn-ability of a sample as the steady decrease of the training loss associated with this sample over time. We develop an algorithm to prioritize samples with high learn-ability, while assigning lower priority to those that are hard-to-learn, typically caused by noise or stochasticity. We empirically show that our method is more robust than random sampling and also better than just prioritizing with respect to the training loss, i.e. the temporal difference loss, which is used in prioritized experience replay., Comment: NeurIPS 2023
Published: 2022

27. Latent Variable Models for Bayesian Causal Discovery

Author: Subramanian, Jithendaraa, Annadani, Yashas, Sheth, Ivaxi, Bauer, Stefan, Nowrouzezahrai, Derek, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: Learning predictors that do not rely on spurious correlations involves building causal representations. However, learning such a representation is very challenging. We, therefore, formulate the problem of learning a causal representation from high dimensional data and study causal recovery with synthetic data. This work introduces a latent variable decoder model, Decoder BCD, for Bayesian causal discovery and performs experiments in mildly supervised and unsupervised settings. We present a series of synthetic experiments to characterize important factors for causal discovery and show that using known intervention targets as labels helps in unsupervised Bayesian inference over structure and parameters of linear Gaussian additive noise latent structural causal models., Comment: 7 figures, Published at the ICML 2022 Workshop on Spurious Correlations, Invariance, and Stability
Published: 2022

28. FHIST: A Benchmark for Few-shot Classification of Histological Images

Author: Shakeri, Fereshteh, Boudiaf, Malik, Mohammadi, Sina, Sheth, Ivaxi, Havaei, Mohammad, Ayed, Ismail Ben, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Few-shot learning has recently attracted wide interest in image classification, but almost all the current public benchmarks are focused on natural images. The few-shot paradigm is highly relevant in medical-imaging applications due to the scarcity of labeled data, as annotations are expensive and require specialized expertise. However, in medical imaging, few-shot learning research is sparse, limited to private data sets and is at its early stage. In particular, the few-shot setting is of high interest in histology due to the diversity and fine granularity of cancer related tissue classification tasks, and the variety of data-preparation techniques. This paper introduces a highly diversified public benchmark, gathered from various public datasets, for few-shot histology data classification. We build few-shot tasks and base-training data with various tissue types, different levels of domain shifts stemming from various cancer sites, and different class-granularity levels, thereby reflecting realistic scenarios. We evaluate the performances of state-of-the-art few-shot learning methods on our benchmark, and observe that simple fine-tuning and regularization methods achieve better results than the popular meta-learning and episodic-training paradigm. Furthermore, we introduce three scenarios based on the domain shifts between the source and target histology data: near-domain, middle-domain and out-domain. Our experiments display the potential of few-shot learning in histology classification, with state-of-art few shot learning methods approaching the supervised-learning baselines in the near-domain setting. In our out-domain setting, for 5-way 5-shot, the best performing method reaches 60% accuracy. We believe that our work could help in building realistic evaluations and fair comparisons of few-shot learning methods and will further encourage research in the few-shot paradigm., Comment: Code available at: https://github.com/mboudiaf/Few-shot-histology
Published: 2022

29. The Sandbox Environment for Generalizable Agent Research (SEGAR)

Author: Hjelm, R Devon, Mazoure, Bogdan, Golemo, Florian, Kahou, Samira Ebrahimi, Braga, Pedro, Frujeri, Felipe, Jalobeanu, Mihai, and Kolobov, Andrey
Subjects: Computer Science - Machine Learning
Abstract: A broad challenge of research on generalization for sequential decision-making tasks in interactive environments is designing benchmarks that clearly landmark progress. While there has been notable headway, current benchmarks either do not provide suitable exposure nor intuitive control of the underlying factors, are not easy-to-implement, customizable, or extensible, or are computationally expensive to run. We built the Sandbox Environment for Generalizable Agent Research (SEGAR) with all of these things in mind. SEGAR improves the ease and accountability of generalization research in RL, as generalization objectives can be easy designed by specifying task distributions, which in turns allows the researcher to measure the nature of the generalization objective. We present an overview of SEGAR and how it contributes to these goals, as well as experiments that demonstrate a few types of research questions SEGAR can help answer.
Published: 2022

30. Simple Video Generation using Neural ODEs

Author: Kanaa, David, Voleti, Vikram, Kahou, Samira Ebrahimi, and Pal, Christopher
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Despite having been studied to a great extent, the task of conditional generation of sequences of frames, or videos, remains extremely challenging. It is a common belief that a key step towards solving this task resides in modelling accurately both spatial and temporal information in video signals. A promising direction to do so has been to learn latent variable models that predict the future in latent space and project back to pixels, as suggested in recent literature. Following this line of work and building on top of a family of models introduced in prior work, Neural ODE, we investigate an approach that models time-continuous dynamics over a continuous latent space with a differential equation with respect to time. The intuition behind this approach is that these trajectories in latent space could then be extrapolated to generate video frames beyond the time steps for which the model is trained. We show that our approach yields promising results in the task of future frame prediction on the Moving MNIST dataset with 1 and 2 digits., Comment: 8 pages, 4 figures, NeurIPS 2019 workshop
Published: 2021

31. Accounting for Variance in Machine Learning Benchmarks

Author: Bouthillier, Xavier, Delaunay, Pierre, Bronzi, Mirko, Trofimov, Assya, Nichyporuk, Brennan, Szeto, Justin, Sepah, Naz, Raff, Edward, Madan, Kanika, Voleti, Vikram, Kahou, Samira Ebrahimi, Michalski, Vincent, Serdyuk, Dmitriy, Arbel, Tal, Pal, Chris, Varoquaux, Gaël, and Vincent, Pascal
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Strong empirical evidence that one machine-learning algorithm A outperforms another one B ideally calls for multiple trials optimizing the learning pipeline over sources of variation such as data sampling, data augmentation, parameter initialization, and hyperparameters choices. This is prohibitively expensive, and corners are cut to reach conclusions. We model the whole benchmarking process, revealing that variance due to data sampling, parameter initialization and hyperparameter choice impact markedly the results. We analyze the predominant comparison methods used today in the light of this variance. We show a counter-intuitive result that adding more sources of variation to an imperfect estimator approaches better the ideal estimator at a 51 times reduction in compute cost. Building on these results, we study the error rate of detecting improvements, on five different deep-learning tasks/architectures. This study leads us to propose recommendations for performance comparisons., Comment: Submitted to MLSys2021
Published: 2021

32. Latent Variable Sequential Set Transformers For Joint Multi-Agent Motion Prediction

Author: Girgis, Roger, Golemo, Florian, Codevilla, Felipe, Weiss, Martin, D'Souza, Jim Aldon, Kahou, Samira Ebrahimi, Heide, Felix, and Pal, Christopher
Subjects: Computer Science - Robotics, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Computer Science - Multiagent Systems
Abstract: Robust multi-agent trajectory prediction is essential for the safe control of robotic systems. A major challenge is to efficiently learn a representation that approximates the true joint distribution of contextual, social, and temporal information to enable planning. We propose Latent Variable Sequential Set Transformers which are encoder-decoder architectures that generate scene-consistent multi-agent trajectories. We refer to these architectures as "AutoBots". The encoder is a stack of interleaved temporal and social multi-head self-attention (MHSA) modules which alternately perform equivariant processing across the temporal and social dimensions. The decoder employs learnable seed parameters in combination with temporal and social MHSA modules allowing it to perform inference over the entire future scene in a single forward pass efficiently. AutoBots can produce either the trajectory of one ego-agent or a distribution over the future trajectories for all agents in the scene. For the single-agent prediction case, our model achieves top results on the global nuScenes vehicle motion prediction leaderboard, and produces strong results on the Argoverse vehicle prediction challenge. In the multi-agent setting, we evaluate on the synthetic partition of TrajNet++ dataset to showcase the model's socially-consistent predictions. We also demonstrate our model on general sequences of sets and provide illustrative experiments modelling the sequential structure of the multiple strokes that make up symbols in the Omniglot data. A distinguishing feature of AutoBots is that all models are trainable on a single desktop GPU (1080 Ti) in under 48h., Comment: 26 pages, 17 figures, 8 tables
Published: 2021

33. Predicting Regional Locust Swarm Distribution with Recurrent Neural Networks

Author: Samil, Hadia Mohmmed Osman Ahmed, Martin, Annabelle, Jain, Arnav Kumar, Amin, Susan, and Kahou, Samira Ebrahimi
Subjects: Computer Science - Machine Learning
Abstract: Locust infestation of some regions in the world, including Africa, Asia and Middle East has become a concerning issue that can affect the health and the lives of millions of people. In this respect, there have been attempts to resolve or reduce the severity of this problem via detection and monitoring of locust breeding areas using satellites and sensors, or the use of chemicals to prevent the formation of swarms. However, such methods have not been able to suppress the emergence and the collective behaviour of locusts. The ability to predict the location of the locust swarms prior to their formation, on the other hand, can help people get prepared and tackle the infestation issue more effectively. Here, we use machine learning to predict the location of locust swarms using the available data published by the Food and Agriculture Organization of the United Nations. The data includes the location of the observed swarms as well as environmental information, including soil moisture and the density of vegetation. The obtained results show that our proposed model can successfully, and with reasonable precision, predict the location of locust swarms, as well as their likely level of damage using a notion of density.
Published: 2020

34. Neural semantic tagging for natural language-based search in building information models: Implications for practice

Author: Shahinmoghadam, Mehrzad, Kahou, Samira Ebrahimi, and Motamedi, Ali
Published: 2024
Full Text: View/download PDF

35. An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation

Author: Michalski, Vincent, Voleti, Vikram, Kahou, Samira Ebrahimi, Ortiz, Anthony, Vincent, Pascal, Pal, Chris, and Precup, Doina
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Batch normalization has been widely used to improve optimization in deep neural networks. While the uncertainty in batch statistics can act as a regularizer, using these dataset statistics specific to the training set impairs generalization in certain tasks. Recently, alternative methods for normalizing feature activations in neural networks have been proposed. Among them, group normalization has been shown to yield similar, in some domains even superior performance to batch normalization. All these methods utilize a learned affine transformation after the normalization operation to increase representational power. Methods used in conditional computation define the parameters of these transformations as learnable functions of conditioning information. In this work, we study whether and where the conditional formulation of group normalization can improve generalization compared to conditional batch normalization. We evaluate performances on the tasks of visual question answering, few-shot learning, and conditional image generation.
Published: 2019

36. Towards Non-saturating Recurrent Units for Modelling Long-term Dependencies

Author: Chandar, Sarath, Sankar, Chinnadhurai, Vorontsov, Eugene, Kahou, Samira Ebrahimi, and Bengio, Yoshua
Subjects: Computer Science - Neural and Evolutionary Computing, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Modelling long-term dependencies is a challenge for recurrent neural networks. This is primarily due to the fact that gradients vanish during training, as the sequence length increases. Gradients can be attenuated by transition operators and are attenuated or dropped by activation functions. Canonical architectures like LSTM alleviate this issue by skipping information through a memory mechanism. We propose a new recurrent architecture (Non-saturating Recurrent Unit; NRU) that relies on a memory mechanism but forgoes both saturating activation functions and saturating gates, in order to further alleviate vanishing gradients. In a series of synthetic and real world tasks, we demonstrate that the proposed model is the only model that performs among the top 2 models across all tasks with and without long-term dependencies, when compared against a range of other architectures., Comment: In Proceedings of AAAI 2019
Published: 2019

37. Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction

Author: El-Nouby, Alaaeldin, Sharma, Shikhar, Schulz, Hannes, Hjelm, Devon, Asri, Layla El, Kahou, Samira Ebrahimi, Bengio, Yoshua, and Taylor, Graham W.
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Conditional text-to-image generation is an active area of research, with many possible applications. Existing research has primarily focused on generating a single image from available conditioning information in one step. One practical extension beyond one-step generation is a system that generates an image iteratively, conditioned on ongoing linguistic input or feedback. This is significantly more challenging than one-step generation tasks, as such a system must understand the contents of its generated images with respect to the feedback history, the current feedback, as well as the interactions among concepts present in the feedback history. In this work, we present a recurrent image generation model which takes into account both the generated output up to the current step as well as all past instructions for generation. We show that our model is able to generate the background, add new objects, and apply simple transformations to existing objects. We believe our approach is an important step toward interactive generation. Code and data is available at: https://www.microsoft.com/en-us/research/project/generative-neural-visual-artist-geneva/ ., Comment: Accepted at ICCV 2019
Published: 2018

38. ChatPainter: Improving Text to Image Generation using Dialogue

Author: Sharma, Shikhar, Suhubdy, Dendi, Michalski, Vincent, Kahou, Samira Ebrahimi, and Bengio, Yoshua
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Synthesizing realistic images from text descriptions on a dataset like Microsoft Common Objects in Context (MS COCO), where each image can contain several objects, is a challenging task. Prior work has used text captions to generate images. However, captions might not be informative enough to capture the entire image and insufficient for the model to be able to understand which objects in the images correspond to which words in the captions. We show that adding a dialogue that further describes the scene leads to significant improvement in the inception score and in the quality of generated images on the MS COCO dataset.
Published: 2018

39. FigureQA: An Annotated Figure Dataset for Visual Reasoning

Author: Kahou, Samira Ebrahimi, Michalski, Vincent, Atkinson, Adam, Kadar, Akos, Trischler, Adam, and Bengio, Yoshua
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We introduce FigureQA, a visual reasoning corpus of over one million question-answer pairs grounded in over 100,000 images. The images are synthetic, scientific-style figures from five classes: line plots, dot-line plots, vertical and horizontal bar graphs, and pie charts. We formulate our reasoning task by generating questions from 15 templates; questions concern various relationships between plot elements and examine characteristics like the maximum, the minimum, area-under-the-curve, smoothness, and intersection. To resolve, such questions often require reference to multiple plot elements and synthesis of information distributed spatially throughout a figure. To facilitate the training of machine learning systems, the corpus also includes side data that can be used to formulate auxiliary objectives. In particular, we provide the numerical data used to generate each figure as well as bounding-box annotations for all plot elements. We study the proposed visual reasoning task by training several models, including the recently proposed Relation Network as a strong baseline. Preliminary results indicate that the task poses a significant machine learning challenge. We envision FigureQA as a first step towards developing models that can intuitively recognize patterns from visual representations of data., Comment: workshop paper at ICLR 2018
Published: 2017

40. The 'something something' video database for learning and evaluating visual common sense

Author: Goyal, Raghav, Kahou, Samira Ebrahimi, Michalski, Vincent, Materzyńska, Joanna, Westphal, Susanne, Kim, Heuna, Haenel, Valentin, Fruend, Ingo, Yianilos, Peter, Mueller-Freitag, Moritz, Hoppe, Florian, Thurau, Christian, Bax, Ingo, and Memisevic, Roland
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Neural networks trained on datasets such as ImageNet have led to major advances in visual object classification. One obstacle that prevents networks from reasoning more deeply about complex scenes and situations, and from integrating visual knowledge with natural language, like humans do, is their lack of common sense knowledge about the physical world. Videos, unlike still images, contain a wealth of detailed information about the physical world. However, most labelled video datasets represent high-level concepts rather than detailed physical aspects about actions and scenes. In this work, we describe our ongoing collection of the "something-something" database of video prediction tasks whose solutions require a common sense understanding of the depicted situation. The database currently contains more than 100,000 videos across 174 classes, which are defined as caption-templates. We also describe the challenges in crowd-sourcing this data at scale.
Published: 2017

41. ExtremeWeather: A large-scale climate dataset for semi-supervised detection, localization, and understanding of extreme weather events

Author: Racah, Evan, Beckham, Christopher, Maharaj, Tegan, Kahou, Samira Ebrahimi, Prabhat, and Pal, Christopher
Subjects: Computer Science - Computer Vision and Pattern Recognition, Statistics - Machine Learning
Abstract: Then detection and identification of extreme weather events in large-scale climate simulations is an important problem for risk management, informing governmental policy decisions and advancing our basic understanding of the climate system. Recent work has shown that fully supervised convolutional neural networks (CNNs) can yield acceptable accuracy for classifying well-known types of extreme weather events when large amounts of labeled data are available. However, many different types of spatially localized climate patterns are of interest including hurricanes, extra-tropical cyclones, weather fronts, and blocking events among others. Existing labeled data for these patterns can be incomplete in various ways, such as covering only certain years or geographic areas and having false negatives. This type of climate data therefore poses a number of interesting machine learning challenges. We present a multichannel spatiotemporal CNN architecture for semi-supervised bounding box prediction and exploratory data analysis. We demonstrate that our approach is able to leverage temporal information and unlabeled data to improve the localization of extreme weather events. Further, we explore the representations learned by our model in order to better understand this important data. We present a dataset, ExtremeWeather, to encourage machine learning research in this area and to help facilitate further work in understanding and mitigating the effects of climate change. The dataset is available at extremeweatherdataset.github.io and the code is available at https://github.com/eracah/hur-detect.
Published: 2016

42. Theano: A Python framework for fast computation of mathematical expressions

Author: The Theano Development Team, Al-Rfou, Rami, Alain, Guillaume, Almahairi, Amjad, Angermueller, Christof, Bahdanau, Dzmitry, Ballas, Nicolas, Bastien, Frédéric, Bayer, Justin, Belikov, Anatoly, Belopolsky, Alexander, Bengio, Yoshua, Bergeron, Arnaud, Bergstra, James, Bisson, Valentin, Snyder, Josh Bleecher, Bouchard, Nicolas, Boulanger-Lewandowski, Nicolas, Bouthillier, Xavier, de Brébisson, Alexandre, Breuleux, Olivier, Carrier, Pierre-Luc, Cho, Kyunghyun, Chorowski, Jan, Christiano, Paul, Cooijmans, Tim, Côté, Marc-Alexandre, Côté, Myriam, Courville, Aaron, Dauphin, Yann N., Delalleau, Olivier, Demouth, Julien, Desjardins, Guillaume, Dieleman, Sander, Dinh, Laurent, Ducoffe, Mélanie, Dumoulin, Vincent, Kahou, Samira Ebrahimi, Erhan, Dumitru, Fan, Ziye, Firat, Orhan, Germain, Mathieu, Glorot, Xavier, Goodfellow, Ian, Graham, Matt, Gulcehre, Caglar, Hamel, Philippe, Harlouchet, Iban, Heng, Jean-Philippe, Hidasi, Balázs, Honari, Sina, Jain, Arjun, Jean, Sébastien, Jia, Kai, Korobov, Mikhail, Kulkarni, Vivek, Lamb, Alex, Lamblin, Pascal, Larsen, Eric, Laurent, César, Lee, Sean, Lefrancois, Simon, Lemieux, Simon, Léonard, Nicholas, Lin, Zhouhan, Livezey, Jesse A., Lorenz, Cory, Lowin, Jeremiah, Ma, Qianli, Manzagol, Pierre-Antoine, Mastropietro, Olivier, McGibbon, Robert T., Memisevic, Roland, van Merriënboer, Bart, Michalski, Vincent, Mirza, Mehdi, Orlandi, Alberto, Pal, Christopher, Pascanu, Razvan, Pezeshki, Mohammad, Raffel, Colin, Renshaw, Daniel, Rocklin, Matthew, Romero, Adriana, Roth, Markus, Sadowski, Peter, Salvatier, John, Savard, François, Schlüter, Jan, Schulman, John, Schwartz, Gabriel, Serban, Iulian Vlad, Serdyuk, Dmitriy, Shabanian, Samira, Simon, Étienne, Spieckermann, Sigurd, Subramanyam, S. Ramana, Sygnowski, Jakub, Tanguay, Jérémie, van Tulder, Gijs, Turian, Joseph, Urban, Sebastian, Vincent, Pascal, Visin, Francesco, de Vries, Harm, Warde-Farley, David, Webb, Dustin J., Willson, Matthew, Xu, Kelvin, Xue, Lijun, Yao, Li, Zhang, Saizheng, and Zhang, Ying
Subjects: Computer Science - Symbolic Computation, Computer Science - Learning, Computer Science - Mathematical Software
Abstract: Theano is a Python library that allows to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano is being actively and continuously developed since 2008, multiple frameworks have been built on top of it and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently-introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it., Comment: 19 pages, 5 figures
Published: 2016

43. Do Deep Convolutional Nets Really Need to be Deep and Convolutional?

Author: Urban, Gregor, Geras, Krzysztof J., Kahou, Samira Ebrahimi, Aslan, Ozlem, Wang, Shengjie, Caruana, Rich, Mohamed, Abdelrahman, Philipose, Matthai, and Richardson, Matt
Subjects: Statistics - Machine Learning, Computer Science - Learning
Abstract: Yes, they do. This paper provides the first empirical demonstration that deep convolutional models really need to be both deep and convolutional, even when trained with methods such as distillation that allow small or shallow models of high accuracy to be trained. Although previous research showed that shallow feed-forward nets sometimes can learn the complex functions previously learned by deep nets while using the same number of parameters as the deep models they mimic, in this paper we demonstrate that the same methods cannot be used to train accurate models on CIFAR-10 unless the student models contain multiple layers of convolution. Although the student models do not have to be as deep as the teacher model they mimic, the students need multiple convolutional layers to learn functions of comparable accuracy as the deep convolutional teacher.
Published: 2016

44. RATM: Recurrent Attentive Tracking Model

Author: Kahou, Samira Ebrahimi, Michalski, Vincent, and Memisevic, Roland
Subjects: Computer Science - Learning
Abstract: We present an attention-based modular neural framework for computer vision. The framework uses a soft attention mechanism allowing models to be trained with gradient descent. It consists of three modules: a recurrent attention module controlling where to look in an image or video frame, a feature-extraction module providing a representation of what is seen, and an objective module formalizing why the model learns its attentive behavior. The attention module allows the model to focus computation on task-related information in the input. We apply the framework to several object tracking tasks and explore various design choices. We experiment with three data sets, bouncing ball, moving digits and the real-world KTH data set. The proposed Recurrent Attentive Tracking Model performs well on all three tasks and can generalize to related but previously unseen sequences from a challenging tracking data set.
Published: 2015

45. EmoNets: Multimodal deep learning approaches for emotion recognition in video

Author: Kahou, Samira Ebrahimi, Bouthillier, Xavier, Lamblin, Pascal, Gulcehre, Caglar, Michalski, Vincent, Konda, Kishore, Jean, Sébastien, Froumenty, Pierre, Dauphin, Yann, Boulanger-Lewandowski, Nicolas, Ferrari, Raul Chandias, Mirza, Mehdi, Warde-Farley, David, Courville, Aaron, Vincent, Pascal, Memisevic, Roland, Pal, Christopher, and Bengio, Yoshua
Subjects: Computer Science - Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: The task of the emotion recognition in the wild (EmotiW) Challenge is to assign one of seven emotions to short video clips extracted from Hollywood style movies. The videos depict acted-out emotions under realistic conditions with a large degree of variation in attributes such as pose and illumination, making it worthwhile to explore approaches which consider combinations of features from multiple modalities for label assignment. In this paper we present our approach to learning several specialist models using deep learning techniques, each focusing on one modality. Among these are a convolutional neural network, focusing on capturing visual information in detected faces, a deep belief net focusing on the representation of the audio stream, a K-Means based "bag-of-mouths" model, which extracts visual features around the mouth region and a relational autoencoder, which addresses spatio-temporal aspects of videos. We explore multiple methods for the combination of cues from these modalities into one common classifier. This achieves a considerably greater accuracy than predictions from our strongest single-modality classifier. Our method was the winning submission in the 2013 EmotiW challenge and achieved a test set accuracy of 47.67% on the 2014 dataset.
Published: 2015

46. FitNets: Hints for Thin Deep Nets

Author: Romero, Adriana, Ballas, Nicolas, Kahou, Samira Ebrahimi, Chassang, Antoine, Gatta, Carlo, and Bengio, Yoshua
Subjects: Computer Science - Learning, Computer Science - Neural and Evolutionary Computing
Abstract: While depth tends to improve network performances, it also makes gradient-based training more difficult since deeper networks tend to be more non-linear. The recently proposed knowledge distillation approach is aimed at obtaining small and fast-to-execute models, and it has shown that a student network could imitate the soft output of a larger teacher network or ensemble of networks. In this paper, we extend this idea to allow the training of a student that is deeper and thinner than the teacher, using not only the outputs but also the intermediate representations learned by the teacher as hints to improve the training process and final performance of the student. Because the student intermediate hidden layer will generally be smaller than the teacher's intermediate hidden layer, additional parameters are introduced to map the student hidden layer to the prediction of the teacher hidden layer. This allows one to train deeper students that can generalize better or run faster, a trade-off that is controlled by the chosen student capacity. For example, on CIFAR-10, a deep student network with almost 10.4 times less parameters outperforms a larger, state-of-the-art teacher network.
Published: 2014

47. Facial Expression Analysis Based on High Dimensional Binary Features

Author: Kahou, Samira Ebrahimi, Froumenty, Pierre, Pal, Christopher, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Agapito, Lourdes, editor, Bronstein, Michael M., editor, and Rother, Carsten, editor
Published: 2015
Full Text: View/download PDF

48. EmoNets: Multimodal deep learning approaches for emotion recognition in video

Author: Kahou, Samira Ebrahimi, Bouthillier, Xavier, Lamblin, Pascal, Gulcehre, Caglar, Michalski, Vincent, Konda, Kishore, Jean, Sébastien, Froumenty, Pierre, Dauphin, Yann, Boulanger-Lewandowski, Nicolas, Chandias Ferrari, Raul, Mirza, Mehdi, Warde-Farley, David, Courville, Aaron, Vincent, Pascal, Memisevic, Roland, Pal, Christopher, and Bengio, Yoshua
Published: 2016
Full Text: View/download PDF

49. Revisiting Learnable Affines for Batch Norm in Few-Shot Transfer Learning

Author: Yazdanpanah, Moslem, primary, Rahman, Aamer Abdul, additional, Chaudhary, Muawiz, additional, Desrosiers, Christian, additional, Havaei, Mohammad, additional, Belilovsky, Eugene, additional, and Kahou, Samira Ebrahimi, additional
Published: 2022
Full Text: View/download PDF

50. Facial Expression Analysis Based on High Dimensional Binary Features

Author: Kahou, Samira Ebrahimi, primary, Froumenty, Pierre, additional, and Pal, Christopher, additional
Published: 2015
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Database

Publisher

118 results on '"Kahou, Samira Ebrahimi"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources