Author: "Bouneffouf, Djallel" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Bouneffouf, Djallel"' showing total 284 results

1. Scopes of Alignment

Author: Varshney, Kush R., Ashktorab, Zahra, Bouneffouf, Djallel, Riemer, Matthew, and Weisz, Justin D.
Subjects: Computer Science - Computers and Society, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Much of the research focus on AI alignment seeks to align large language models and other foundation models to the context-less and generic values of helpfulness, harmlessness, and honesty. Frontier model providers also strive to align their models with these values. In this paper, we motivate why we need to move beyond such a limited conception and propose three dimensions for doing so. The first scope of alignment is competence: knowledge, skills, or behaviors the model must possess to be useful for its intended purpose. The second scope of alignment is transience: either semantic or episodic depending on the context of use. The third scope of alignment is audience: either mass, public, small-group, or dyadic. At the end of the paper, we use the proposed framework to position some technologies and workflows that go beyond prevailing notions of alignment.
Comment: The 2nd International Workshop on AI Governance (AIGOV) held in conjunction with AAAI 2025
Published: 2025

2. Position: Theory of Mind Benchmarks are Broken for Large Language Models

Author: Riemer, Matthew, Ashktorab, Zahra, Bouneffouf, Djallel, Das, Payel, Liu, Miao, Weisz, Justin D., and Campbell, Murray
Subjects: Computer Science - Artificial Intelligence
Abstract: This position paper argues that the majority of theory of mind benchmarks are broken because of their inability to directly test how large language models (LLMs) adapt to new partners. This problem stems from the fact that theory of mind benchmarks for LLMs are overwhelmingly inspired by the methods used to test theory of mind in humans and fall victim to a fallacy of attributing human-like qualities to AI agents. We expect that humans will engage in a consistent reasoning process across various questions about a situation, but this is known to not be the case for current LLMs. Most theory of mind benchmarks only measure what we call literal theory of mind: the ability to predict the behavior of others. Measuring this kind of reasoning is very informative in testing the ability of agents with self-consistent reasoning. However, it is important to note the distinction between this and what we actually care about when this self-consistency cannot be taken for granted. We call this functional theory of mind: the ability to adapt to agents in-context following a rational response to predictions about their behavior. We find that top performing open source LLMs may display strong capabilities in literal theory of mind, depending on how they are prompted, but seem to struggle with functional theory of mind -- even when partner policies are exceedingly simple. Simply put, strong literal theory of mind performance does not necessarily imply strong functional theory of mind performance. Achieving functional theory of mind, particularly over long interaction horizons with a partner, is a significant challenge deserving a prominent role in any meaningful LLM theory of mind evaluation.
Published: 2024

3. Evaluating the Prompt Steerability of Large Language Models

Author: Miehling, Erik, Desmond, Michael, Ramamurthy, Karthikeyan Natesan, Daly, Elizabeth M., Dognin, Pierre, Rios, Jesus, Bouneffouf, Djallel, and Liu, Miao
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction
Abstract: Building pluralistic AI requires designing models that are able to be shaped to represent a wide range of value systems and cultures. Achieving this requires first being able to evaluate the degree to which a given model is capable of reflecting various personas. To this end, we propose a benchmark for evaluating the steerability of model personas as a function of prompting. Our design is based on a formal definition of prompt steerability, which analyzes the degree to which a model's joint behavioral distribution can be shifted from its baseline. By defining steerability indices and inspecting how these indices change as a function of steering effort, we can estimate the steerability of a model across various persona dimensions and directions. Our benchmark reveals that the steerability of many current models is limited -- due to both a skew in their baseline behavior and an asymmetry in their steerability across many persona dimensions. We release an implementation of our benchmark at https://github.com/IBM/prompt-steering.
Comment: Short version appeared at the Pluralistic Alignment workshop at NeurIPS 2024; extended version appeared at NAACL 2025
Published: 2024

4. Assessing AI Utility: The Random Guesser Test for Sequential Decision-Making Systems

Author: Ide, Shun, Blunt, Allison, and Bouneffouf, Djallel
Subjects: Computer Science - Computers and Society, Computer Science - Artificial Intelligence
Abstract: We propose a general approach to quantitatively assessing the risk and vulnerability of artificial intelligence (AI) systems to biased decisions. The guiding principle of the proposed approach is that any AI algorithm must outperform a random guesser. This may appear trivial, but empirical results from a simplistic sequential decision-making scenario involving roulette games show that sophisticated AI-based approaches often underperform the random guesser by a significant margin. We highlight that modern recommender systems may exhibit a similar tendency to favor overly low-risk options. We argue that this "random guesser test" can serve as a useful tool for evaluating the utility of AI actions, and also points towards increasing exploration as a potential improvement to such systems.
Comment: Accepted into AIBS 2024: The First Workshop on AI Behavioral Science, 5 pages, 4 figures
Published: 2024

5. Sequential uncertainty quantification with contextual tensors for social targeting

Author: Idé, Tsuyoshi, Murugesan, Keerthiram, Bouneffouf, Djallel, and Abe, Naoki
Published: 2025
Full Text: View/download PDF

6. Conversational Topic Recommendation in Counseling and Psychotherapy with Decision Transformer and Large Language Models

Author: Gunal, Aylin, Lin, Baihan, and Bouneffouf, Djallel
Subjects: Computer Science - Computation and Language
Abstract: Given the increasing demand for mental health assistance, artificial intelligence (AI), particularly large language models (LLMs), may be valuable for integration into automated clinical support systems. In this work, we leverage a decision transformer architecture for topic recommendation in counseling conversations between patients and mental health professionals. The architecture is utilized for offline reinforcement learning, and we extract states (dialogue turn embeddings), actions (conversation topics), and rewards (scores measuring the alignment between patient and therapist) from previous turns within a conversation to train a decision transformer model. We demonstrate an improvement over baseline reinforcement learning methods, and propose a novel system of utilizing our model's output as synthetic labels for fine-tuning a large language model for the same task. Although our implementation based on LLaMA-2 7B has mixed results, future work can undoubtedly build on the design.
Comment: 5 pages excluding references, 3 figures; accepted at Clinical NLP Workshop @ NAACL 2024
Published: 2024

7. Contextual Moral Value Alignment Through Context-Based Aggregation

Author: Dognin, Pierre, Rios, Jesus, Luss, Ronny, Padhi, Inkit, Riemer, Matthew D, Liu, Miao, Sattigeri, Prasanna, Nagireddy, Manish, Varshney, Kush R., and Bouneffouf, Djallel
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Developing value-aligned AI agents is a complex undertaking and an ongoing challenge in the field of AI. Specifically within the domain of Large Language Models (LLMs), the capability to consolidate multiple independently trained dialogue agents, each aligned with a distinct moral value, into a unified system that can adapt to and be aligned with multiple moral values is of paramount importance. In this paper, we propose a system that does contextual moral value alignment based on contextual aggregation. Here, aggregation is defined as the process of integrating a subset of LLM responses that are best suited to respond to a user input, taking into account features extracted from the user's input. The proposed system shows better results in terms of alignment to human values compared to the state of the art.
Published: 2024

8. Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations

Author: Achintalwar, Swapnaja, Garcia, Adriana Alvarado, Anaby-Tavor, Ateret, Baldini, Ioana, Berger, Sara E., Bhattacharjee, Bishwaranjan, Bouneffouf, Djallel, Chaudhury, Subhajit, Chen, Pin-Yu, Chiazor, Lamogha, Daly, Elizabeth M., DB, Kirushikesh, de Paula, Rogério Abreu, Dognin, Pierre, Farchi, Eitan, Ghosh, Soumya, Hind, Michael, Horesh, Raya, Kour, George, Lee, Ja Young, Madaan, Nishtha, Mehta, Sameep, Miehling, Erik, Murugesan, Keerthiram, Nagireddy, Manish, Padhi, Inkit, Piorkowski, David, Rawat, Ambrish, Raz, Orna, Sattigeri, Prasanna, Strobelt, Hendrik, Swaminathan, Sarathkrishna, Tillmann, Christoph, Trivedi, Aashka, Varshney, Kush R., Wei, Dennis, Witherspoon, Shalisha, and Zalmanovici, Marcel
Subjects: Computer Science - Machine Learning
Abstract: Large language models (LLMs) are susceptible to a variety of risks, from non-faithful output to biased and toxic generations. Due to several limiting factors surrounding LLMs (training cost, API access, data availability, etc.), it may not always be feasible to impose direct safety constraints on a deployed model. Therefore, an efficient and reliable alternative is required. To this end, we present our ongoing efforts to create and deploy a library of detectors: compact and easy-to-build classification models that provide labels for various harms. In addition to the detectors themselves, we discuss a wide range of uses for these detector models - from acting as guardrails to enabling effective AI governance. We also deep dive into inherent challenges in their development and discuss future work aimed at making the detectors more reliable and broadening their scope.
Published: 2024

9. Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations

Author: Achintalwar, Swapnaja, Baldini, Ioana, Bouneffouf, Djallel, Byamugisha, Joan, Chang, Maria, Dognin, Pierre, Farchi, Eitan, Makondo, Ndivhuwo, Mojsilovic, Aleksandra, Nagireddy, Manish, Ramamurthy, Karthikeyan Natesan, Padhi, Inkit, Raz, Orna, Rios, Jesus, Sattigeri, Prasanna, Singh, Moninder, Thwala, Siphiwe, Uceda-Sosa, Rosario A., and Varshney, Kush R.
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: The alignment of large language models is usually done by model providers to add or control behaviors that are common or universally understood across use cases and contexts. In contrast, in this article, we present an approach and architecture that empowers application developers to tune a model to their particular values, social norms, laws and other regulations, and orchestrate between potentially conflicting requirements in context. We lay out three main components of such an Alignment Studio architecture: Framers, Instructors, and Auditors that work in concert to control the behavior of a language model. We illustrate this approach with a running example of aligning a company's internal-facing enterprise chatbot to its business conduct guidelines.
Comment: 7 pages, 5 figures
Published: 2024

10. COMPASS: Computational Mapping of Patient-Therapist Alliance Strategies with Language Modeling

Author: Lin, Baihan, Bouneffouf, Djallel, Landa, Yulia, Jespersen, Rachel, Corcoran, Cheryl, and Cecchi, Guillermo
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction, Computer Science - Machine Learning, Quantitative Biology - Neurons and Cognition
Abstract: The therapeutic working alliance is a critical factor in predicting the success of psychotherapy treatment. Traditionally, working alliance assessment relies on questionnaires completed by both therapists and patients. In this paper, we present COMPASS, a novel framework to directly infer the therapeutic working alliance from the natural language used in psychotherapy sessions. Our approach utilizes advanced large language models (LLMs) to analyze transcripts of psychotherapy sessions and compare them with distributed representations of statements in the working alliance inventory. Analyzing a dataset of over 950 sessions covering diverse psychiatric conditions including anxiety, depression, schizophrenia, and suicidal tendencies, we demonstrate the effectiveness of our method in providing fine-grained mapping of patient-therapist alignment trajectories and offering interpretability for clinical psychiatry and in identifying emerging patterns related to the condition being treated. By employing various deep learning-based topic modeling techniques in combination with prompting generative language models, we analyze the topical characteristics of different psychiatric conditions and their evolution at a turn-level resolution. This combined framework enhances the understanding of therapeutic interactions, enabling timely feedback for therapists regarding the quality of therapeutic relationships and providing interpretable insights to improve the effectiveness of psychotherapy.
Comment: This work extends our research series in computational psychiatry (e.g. auto annotation in arXiv:2204.05522, topic extraction in arXiv:2204.10189, and diagnosis in arXiv:2210.15603) with the introduction of LLMs to complete the full cycle of interpreting and understanding psychotherapy strategies as a comprehensive analytical framework
Published: 2024

11. Interpolating Item and User Fairness in Multi-Sided Recommendations

Author: Chen, Qinyi, Liang, Jason Cheuk Nam, Golrezaei, Negin, and Bouneffouf, Djallel
Subjects: Computer Science - Information Retrieval, Computer Science - Computers and Society, Computer Science - Computer Science and Game Theory, Computer Science - Machine Learning
Abstract: Today's online platforms heavily lean on algorithmic recommendations for bolstering user engagement and driving revenue. However, these recommendations can impact multiple stakeholders simultaneously -- the platform, items (sellers), and users (customers) -- each with their unique objectives, making it difficult to find the right middle ground that accommodates all stakeholders. To address this, we introduce a novel fair recommendation framework, Problem (FAIR), that flexibly balances multi-stakeholder interests via a constrained optimization formulation. We next explore Problem (FAIR) in a dynamic online setting where data uncertainty further adds complexity, and propose a low-regret algorithm FORM that concurrently performs real-time learning and fair recommendations, two tasks that are often at odds. Via both theoretical analysis and a numerical case study on real-world data, we demonstrate the efficacy of our framework and method in maintaining platform revenue while ensuring desired levels of fairness for both items and users.
Published: 2023

12. Utterance Classification with Logical Neural Network: Explainable AI for Mental Disorder Diagnosis

Author: Toleubay, Yeldar, Agravante, Don Joven, Kimura, Daiki, Lin, Baihan, Bouneffouf, Djallel, and Tatsubori, Michiaki
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Logic in Computer Science, Quantitative Biology - Neurons and Cognition
Abstract: In response to the global challenge of mental health problems, we propose a Logical Neural Network (LNN) based Neuro-Symbolic AI method for the diagnosis of mental disorders. Due to the lack of effective therapy coverage for mental disorders, there is a need for an AI solution that can assist therapists with the diagnosis. However, current Neural Network models lack explainability and may not be trusted by therapists. The LNN is a Recurrent Neural Network architecture that combines the learning capabilities of neural networks with the reasoning capabilities of classical logic-based AI. The proposed system uses input predicates from clinical interviews to output a mental disorder class, and different predicate pruning techniques are used to achieve scalability and higher scores. In addition, we provide an insight extraction method to aid therapists with their diagnosis. The proposed system addresses the lack of explainability of current Neural Network models and provides a more trustworthy solution for mental disorder diagnosis.
Comment: ACL 2023
Published: 2023

13. Towards Healthy AI: Large Language Models Need Therapists Too

Author: Lin, Baihan, Bouneffouf, Djallel, Cecchi, Guillermo, and Varshney, Kush R.
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computers and Society, Computer Science - Human-Computer Interaction, Computer Science - Machine Learning
Abstract: Recent advances in large language models (LLMs) have led to the development of powerful AI chatbots capable of engaging in natural and human-like conversations. However, these chatbots can be potentially harmful, exhibiting manipulative, gaslighting, and narcissistic behaviors. We define Healthy AI to be safe, trustworthy and ethical. To create healthy AI systems, we present the SafeguardGPT framework that uses psychotherapy to correct for these harmful behaviors in AI chatbots. The framework involves four types of AI agents: a Chatbot, a "User," a "Therapist," and a "Critic." We demonstrate the effectiveness of SafeguardGPT through a working example of simulating a social conversation. Our results show that the framework can improve the quality of conversations between AI chatbots and humans. Although there are still several challenges and directions to be addressed in the future, SafeguardGPT provides a promising approach to improving the alignment between AI chatbots and human values. By incorporating psychotherapy and reinforcement learning techniques, the framework enables AI chatbots to learn and adapt to human preferences and values in a safe and ethical way, contributing to the development of a more human-centric and responsible AI.
Published: 2023

14. Psychotherapy AI Companion with Reinforcement Learning Recommendations and Interpretable Policy Dynamics

Author: Lin, Baihan, Cecchi, Guillermo, and Bouneffouf, Djallel
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Human-Computer Interaction, Quantitative Biology - Neurons and Cognition
Abstract: We introduce a Reinforcement Learning Psychotherapy AI Companion that generates topic recommendations for therapists based on patient responses. The system uses Deep Reinforcement Learning (DRL) to generate multi-objective policies for four different psychiatric conditions: anxiety, depression, schizophrenia, and suicidal cases. We present our experimental results on the accuracy of recommended topics using three different scales of working alliance ratings: task, bond, and goal. We show that the system is able to capture the real data (historical topics discussed by the therapists) relatively well, and that the best performing models vary by disorder and rating scale. To gain interpretable insights into the learned policies, we visualize policy trajectories in a 2D principal component analysis space and transition matrices. These visualizations reveal distinct patterns in the policies trained with different reward signals and trained on different clinical diagnoses. Our system's success in generating DIsorder-Specific Multi-Objective Policies (DISMOP) and interpretable policy dynamics demonstrates the potential of DRL in providing personalized and efficient therapeutic recommendations.
Comment: WWW 2023. This work supersedes our prior work arXiv:2208.13077 by studying the interpretability of RL-based therapy agents with policy visualizations
Published: 2023

15. TherapyView: Visualizing Therapy Sessions with Temporal Topic Modeling and AI-Generated Arts

Author: Lin, Baihan, Zecevic, Stefan, Bouneffouf, Djallel, and Cecchi, Guillermo
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction, Computer Science - Machine Learning
Abstract: We present TherapyView, a demonstration system that helps therapists visualize the dynamic contents of past treatment sessions. It is enabled by state-of-the-art neural topic modeling techniques, which analyze the topical tendencies of various psychiatric conditions, and a deep learning-based image generation engine, which provides a visual summary. The system incorporates temporal modeling to provide a time-series representation of topic similarities at a turn-level resolution, and AI-generated artworks based on the dialogue segments to provide a concise representation of the contents covered in the session, offering interpretable insights for therapists to optimize their strategies and enhance the effectiveness of psychotherapy. This system provides a proof of concept of AI-augmented therapy tools with an in-depth understanding of the patient's mental state, enabling more effective treatment.
Comment: This work extends our prior empirical work on topic modeling (arXiv:2204.10189) to now provide an interpretable and interactive data visualization platform with AI-generated artworks as a concrete user scenario for therapists
Published: 2023

16. A Survey on Compositional Generalization in Applications

Author: Lin, Baihan, Bouneffouf, Djallel, and Rish, Irina
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Symbolic Computation
Abstract: The field of compositional generalization is currently experiencing a renaissance in AI, as novel problem settings and algorithms motivated by various practical applications are being introduced, building on top of the classical compositional generalization problem. This article aims to provide a comprehensive review of the top recent developments in multiple real-life applications of compositional generalization. Specifically, we introduce a taxonomy of common applications and summarize the state-of-the-art for each of those domains. Furthermore, we identify important current trends and provide new perspectives pertaining to the future of this burgeoning field.
Published: 2023

17. Non-Stationary Bandits with Auto-Regressive Temporal Dependency

Author: Chen, Qinyi, Golrezaei, Negin, and Bouneffouf, Djallel
Subjects: Computer Science - Machine Learning, Computer Science - Data Structures and Algorithms
Abstract: Traditional multi-armed bandit (MAB) frameworks, predominantly examined under stochastic or adversarial settings, often overlook the temporal dynamics inherent in many real-world applications such as recommendation systems and online advertising. This paper introduces a novel non-stationary MAB framework that captures the temporal structure of these real-world dynamics through an auto-regressive (AR) reward structure. We propose an algorithm that integrates two key mechanisms: (i) an alternation mechanism adept at leveraging temporal dependencies to dynamically balance exploration and exploitation, and (ii) a restarting mechanism designed to discard out-of-date information. Our algorithm achieves a regret upper bound that nearly matches the lower bound, with regret measured against a robust dynamic benchmark. Finally, via a real-world case study on tourism demand prediction, we demonstrate both the efficacy of our algorithm and the broader applicability of our techniques to more complex, rapidly evolving time series.
Comment: 45 pages, 8 figures
Published: 2022

18. Working Alliance Transformer for Psychotherapy Dialogue Classification

Author: Lin, Baihan, Cecchi, Guillermo, and Bouneffouf, Djallel
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction, Computer Science - Machine Learning, Quantitative Biology - Neurons and Cognition
Abstract: As a predictive measure of the treatment outcome in psychotherapy, the working alliance measures the agreement of the patient and the therapist in terms of their bond, task and goal. Although it has long been a clinical quantity estimated from the patients' and therapists' self-evaluative reports, we believe that the working alliance can be better characterized by applying natural language processing techniques directly to the dialogue transcribed in each therapy session. In this work, we propose the Working Alliance Transformer (WAT), a Transformer-based classification model with a psychological state encoder that infers the working alliance scores by projecting the embeddings of the dialogue turns onto the embedding space of the clinical inventory for working alliance. We evaluate our method on a real-world dataset with over 950 therapy sessions with anxiety, depression, schizophrenia and suicidal patients and demonstrate an empirical advantage of using information about the therapeutic states in this sequence classification task of psychotherapy dialogues.
Published: 2022

19. Survey on Applications of Neurosymbolic Artificial Intelligence

Author: Bouneffouf, Djallel and Aggarwal, Charu C.
Subjects: Computer Science - Artificial Intelligence, Computer Science - Symbolic Computation
Abstract: In recent years, the Neurosymbolic framework has attracted a lot of attention in various applications, from recommender systems and information retrieval to healthcare and finance. This success is due to its stellar performance combined with attractive properties, such as learning and reasoning. The new emerging Neurosymbolic field is currently experiencing a renaissance, as novel frameworks and algorithms motivated by various practical applications are being introduced, building on top of the classical neural and reasoning problem setting. This article aims to provide a comprehensive review of significant recent developments in real-world applications of Neurosymbolic Artificial Intelligence. Specifically, we introduce a taxonomy of common Neurosymbolic applications and summarize the state-of-the-art for each of those domains. Furthermore, we identify important current trends and provide new perspectives pertaining to the future of this burgeoning field.
Published: 2022

20. SupervisorBot: NLP-Annotated Real-Time Recommendations of Psychotherapy Treatment Strategies with Deep Reinforcement Learning

Author: Lin, Baihan, Cecchi, Guillermo, and Bouneffouf, Djallel
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction, Computer Science - Machine Learning, Quantitative Biology - Neurons and Cognition
Abstract: We propose a recommendation system that suggests treatment strategies to a therapist during the psychotherapy session in real-time. Our system uses a turn-level rating mechanism that predicts the therapeutic outcome by computing a similarity score between the deep embedding of a scoring inventory and the current sentence that the patient is speaking. The system automatically transcribes a continuous audio stream, separates it into turns of the patient and of the therapist, and performs real-time inference of their therapeutic working alliance. The dialogue pairs along with their computed working alliance as ratings are then fed into a deep reinforcement learning recommendation system where the sessions are treated as users and the topics are treated as items. Other than evaluating the empirical advantages of the core components on an existing dataset of psychotherapy sessions, we demonstrate the effectiveness of this system in a web app.
Comment: This work extends our work series in interactive speech or text systems for psychotherapy (e.g. arXiv:2006.04376, arXiv:2204.05522 and arXiv:2204.10189) and proposes a novel recommendation setting
Published: 2022

21. Neural Topic Modeling of Psychotherapy Sessions

Author: Lin, Baihan, Bouneffouf, Djallel, Cecchi, Guillermo, and Tejwani, Ravi
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction, Computer Science - Machine Learning, Quantitative Biology - Neurons and Cognition
Abstract: In this work, we compare different neural topic modeling methods in learning the topical propensities of different psychiatric conditions from the psychotherapy session transcripts parsed from speech recordings. We also incorporate temporal modeling to put this additional interpretability to action by parsing out topic similarities as a time series in a turn-level resolution. We believe this topic modeling framework can offer interpretable insights for the therapist to optimally decide his or her strategy and improve psychotherapy effectiveness.
Comment: This work extends our research series in computational linguistics for psychiatry (e.g. working alliance analysis in arXiv:2204.05522) with a systematic investigation of neural topic modeling approaches to provide interpretable insights in psychotherapy
Published: 2022

22. Deep Annotation of Therapeutic Working Alliance in Psychotherapy

Author: Lin, Baihan, Cecchi, Guillermo, and Bouneffouf, Djallel
Subjects: Quantitative Biology - Neurons and Cognition, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Human-Computer Interaction, Computer Science - Machine Learning
Abstract: The therapeutic working alliance is an important predictor of the outcome of psychotherapy treatment. In practice, the working alliance is estimated from a set of scoring questionnaires in an inventory that both the patient and the therapist fill out. In this work, we propose an analytical framework for directly inferring the therapeutic working alliance from the natural language within the psychotherapy sessions at a turn-level resolution with deep embeddings such as the Doc2Vec and SentenceBERT models. The transcript of each psychotherapy session can be generated in real time from the session speech recordings, and these embedded dialogues are compared with the distributed representations of the statements in the working alliance inventory. We demonstrate, on a real-world dataset with over 950 sessions of psychotherapy treatments in anxiety, depression, schizophrenia and suicidal patients, the effectiveness of this method in mapping out trajectories of patient-therapist alignment and the interpretability that can offer insights in clinical psychiatry. We believe such a framework can provide timely feedback to the therapist regarding the quality of the conversation in interview sessions.
Published: 2022

23. Optimal Epidemic Control as a Contextual Combinatorial Bandit with Budget

Author: Lin, Baihan and Bouneffouf, Djallel
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computers and Society
Abstract: In light of the COVID-19 pandemic, it is an open challenge and critical practical problem to find an optimal way to dynamically prescribe the best policies that balance both the governmental resources and epidemic control in different countries and regions. To solve this multi-dimensional tradeoff of exploitation and exploration, we formulate this technical challenge as a contextual combinatorial bandit problem that jointly optimizes a multi-criteria reward function. Given the historical daily cases in a region and the past intervention plans in place, the agent should generate useful intervention plans that policy makers can implement in real time to minimize both the number of daily COVID-19 cases and the stringency of the recommended interventions. We prove this concept with simulations of multiple realistic policy making scenarios and demonstrate a clear advantage in providing a Pareto-optimal solution in the epidemic intervention problem.
Comment: Proceedings of FUZZ-IEEE 2022. This work extends our prior work on real-world applications of budgeted bandits (e.g. arXiv:1906.09384), and aims to solve the critical problem of epidemic control. Codes at: https://github.com/doerlbh/BanditZoo
Published: 2021

24. Reinforcement Learning with Algorithms from Probabilistic Structure Estimation

Author: Epperlein, Jonathan P., Overko, Roman, Zhuk, Sergiy, King, Christopher, Bouneffouf, Djallel, Cullen, Andrew, and Shorten, Robert
Subjects: Computer Science - Machine Learning
Abstract: Reinforcement learning (RL) algorithms aim to learn optimal decisions in unknown environments through experience of taking actions and observing the rewards gained. In some cases, the environment is not influenced by the actions of the RL agent, in which case the problem can be modeled as a contextual multi-armed bandit and lightweight myopic algorithms can be employed. On the other hand, when the RL agent's actions affect the environment, the problem must be modeled as a Markov decision process and more complex RL algorithms are required which take the future effects of actions into account. Moreover, in practice, it is often unknown from the outset whether or not the agent's actions will impact the environment and it is therefore not possible to determine which RL algorithm is most fitting. In this work, we propose to avoid this difficult decision entirely and incorporate a choice mechanism into our RL framework. Rather than assuming a specific problem structure, we use a probabilistic structure estimation procedure based on a likelihood-ratio (LR) test to make a more informed selection of learning algorithm. We derive a sufficient condition under which myopic policies are optimal, present an LR test for this condition, and derive a bound on the regret of our framework. We provide examples of real-world scenarios where our framework is needed and provide extensive simulations to validate our approach.
Published: 2021

25. Etat de l'art sur l'application des bandits multi-bras

Author: Bouneffouf, Djallel
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: The multi-armed bandit offers the advantage of learning and exploiting already learnt knowledge at the same time. This capability allows the approach to be applied in different domains, from clinical trials, where the goal is to investigate the effects of different experimental treatments while minimizing patient losses, to adaptive routing, where the goal is to minimize the delays in a network. This article provides a review of recent results on applying bandits to real-life scenarios and summarizes the state of the art for each of these fields. Different techniques have been proposed to solve this problem setting, such as epsilon-greedy, upper confidence bound (UCB) and Thompson Sampling (TS). We show here how these algorithms were adapted to solve the different problems of exploration and exploitation., Comment: in French
Published: 2021
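
The three strategies named in the abstract above (epsilon-greedy, UCB and Thompson Sampling) can be illustrated with a minimal sketch for a Bernoulli bandit. This is a generic textbook-style illustration, not code from the surveyed article: the function names, the `successes`/`pulls` bookkeeping, the UCB1 exploration bonus and the uniform Beta(1, 1) prior are all assumptions made for the example.

```python
import math
import random


def epsilon_greedy(successes, pulls, epsilon, rng):
    """With probability epsilon pick a random arm, otherwise the best empirical mean."""
    if rng.random() < epsilon:
        return rng.randrange(len(pulls))
    means = [s / n if n > 0 else 0.0 for s, n in zip(successes, pulls)]
    return max(range(len(pulls)), key=means.__getitem__)


def ucb1(successes, pulls):
    """Pick the arm with the highest empirical mean plus UCB1 exploration bonus."""
    # Play every arm once before the bonus term is well defined.
    for arm, n in enumerate(pulls):
        if n == 0:
            return arm
    total = sum(pulls)
    scores = [
        s / n + math.sqrt(2.0 * math.log(total) / n)
        for s, n in zip(successes, pulls)
    ]
    return max(range(len(pulls)), key=scores.__getitem__)


def thompson(successes, pulls, rng):
    """Sample each arm's success rate from its Beta posterior and pick the largest."""
    # Assumes 0/1 rewards and a uniform Beta(1, 1) prior on every arm.
    samples = [
        rng.betavariate(1 + s, 1 + (n - s)) for s, n in zip(successes, pulls)
    ]
    return max(range(len(pulls)), key=samples.__getitem__)
```

Each function only selects an arm; the caller is expected to pull it, observe a 0/1 reward, and update `successes` and `pulls` before the next round.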

26. Predicting human decision making in psychological tasks with recurrent neural networks

Author: Lin, Baihan, Bouneffouf, Djallel, and Cecchi, Guillermo
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Quantitative Biology - Neurons and Cognition
Abstract: Unlike traditional time series, the action sequences of human decision making usually involve many cognitive processes such as beliefs, desires, intentions, and theory of mind, i.e., what others are thinking. This makes it challenging to predict human decision-making agnostically to the underlying psychological mechanisms. We propose here to use a recurrent neural network architecture based on long short-term memory networks (LSTM) to predict the time series of the actions taken by human subjects engaged in gaming activity, the first application of such methods in this research domain. In this study, we collate the human data from 8 published studies of the Iterated Prisoner's Dilemma comprising 168,386 individual decisions and post-process them into 8,257 behavioral trajectories of 9 actions each for both players. Similarly, we collate 617 trajectories of 95 actions from 10 different published studies of Iowa Gambling Task experiments with healthy human subjects. We train our prediction networks on the behavioral data and demonstrate a clear advantage over the state-of-the-art methods in predicting human decision-making trajectories in both the single-agent scenario of the Iowa Gambling Task and the multi-agent scenario of the Iterated Prisoner's Dilemma. Moreover, we observe that the weights of the LSTM networks modeling the top performers tend to have a wider distribution compared to poor performers, as well as a larger bias, which suggests possible interpretations for the distribution of strategies adopted by each group., Comment: To appear in PLOS ONE. Codes at https://github.com/doerlbh/HumanLSTM
Published: 2020
Full Text: View/download PDF

27. Double-Linear Thompson Sampling for Context-Attentive Bandits

Author: Bouneffouf, Djallel, Féraud, Raphaël, Upadhyay, Sohini, Khazaeni, Yasaman, and Rish, Irina
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: In this paper, we analyze and extend an online learning framework known as Context-Attentive Bandit, motivated by various practical applications, from medical diagnosis to dialog systems, where due to observation costs only a small subset of a potentially large number of context variables can be observed at each iteration; however, the agent has the freedom to choose which variables to observe. We derive a novel algorithm, called Context-Attentive Thompson Sampling (CATS), which builds upon the Linear Thompson Sampling approach, adapting it to the Context-Attentive Bandit setting. We provide a theoretical regret analysis and an extensive empirical evaluation demonstrating advantages of the proposed approach over several baseline methods on a variety of real-life datasets., Comment: arXiv admin note: text overlap with arXiv:1906.09384
Published: 2020

28. Learning to Generate Image Source-Agnostic Universal Adversarial Perturbations

Author: Zhao, Pu, Ram, Parikshit, Lu, Songtao, Yao, Yuguang, Bouneffouf, Djallel, Lin, Xue, and Liu, Sijia
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Statistics - Machine Learning
Abstract: Adversarial perturbations are critical for certifying the robustness of deep learning models. A universal adversarial perturbation (UAP) can simultaneously attack multiple images, and thus offers a more unified threat model, obviating an image-wise attack algorithm. However, the existing UAP generator is underdeveloped when images are drawn from different image sources (e.g., with different image resolutions). Towards an authentic universality across image sources, we take a novel view of UAP generation as a customized instance of few-shot learning, which leverages bilevel optimization and learning-to-optimize (L2O) techniques for UAP generation with improved attack success rate (ASR). We begin by considering the popular model agnostic meta-learning (MAML) framework to meta-learn a UAP generator. However, we see that the MAML framework does not directly offer the universal attack across image sources, requiring us to integrate it with another meta-learning framework of L2O. The resulting scheme for meta-learning a UAP generator (i) has better performance (50% higher ASR) than baselines such as Projected Gradient Descent, (ii) has better performance (37% faster) than the vanilla L2O and MAML frameworks (when applicable), and (iii) is able to simultaneously handle UAP generation for different victim models and image data sources.
Published: 2020

29. Spectral Clustering using Eigenspectrum Shape Based Nystrom Sampling

Author: Bouneffouf, Djallel
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Spectral clustering has shown superior performance in analyzing cluster structure. However, its computational complexity limits its application in analyzing large-scale data. To address this problem, many low-rank matrix approximation algorithms have been proposed, including the Nystrom method, an approach with proven approximation error bounds. There are several algorithms that provide recipes to construct Nystrom approximations with variable accuracies and computing times. This paper proposes a scalable Nystrom-based clustering algorithm with a new sampling procedure, Centroid Minimum Sum of Squared Similarities (CMS3), and a heuristic on when to use it. Our heuristic depends on the eigenspectrum shape of the dataset, and yields competitive low-rank approximations in test datasets compared to the other state-of-the-art methods.
Published: 2020

30. Computing the Dirichlet-Multinomial Log-Likelihood Function

Author: Bouneffouf, Djallel
Subjects: Statistics - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: The Dirichlet-multinomial (DMN) distribution is commonly used to model over-dispersion in count data. Precise and fast numerical computation of the DMN log-likelihood function is important for performing statistical inference using this distribution, and remains a challenge. To address this, we use mathematical properties of the gamma function to derive a closed form expression for the DMN log-likelihood function. Compared to existing methods, calculation of the closed form has a lower computational complexity, hence is much faster without compromising computational accuracy.
Published: 2020
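
For orientation, the DMN log-likelihood discussed in the abstract above can be evaluated directly from its standard log-gamma form. The sketch below is that baseline computation only, not the closed-form expression the paper derives; the function name and argument layout are assumptions made for illustration.

```python
import math


def dmn_log_likelihood(counts, alpha):
    """Log-probability of one count vector under a Dirichlet-multinomial.

    Uses the textbook log-gamma formula; `counts` are non-negative integers
    and `alpha` are positive concentration parameters of the same length.
    """
    if len(counts) != len(alpha):
        raise ValueError("counts and alpha must have the same length")
    n = sum(counts)
    a = sum(alpha)
    # Multinomial coefficient: log n! - sum_k log x_k!
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    # Ratio of Dirichlet normalizers: log G(A) - log G(n + A)
    log_norm = math.lgamma(a) - math.lgamma(n + a)
    # Per-category terms: sum_k [log G(x_k + a_k) - log G(a_k)]
    log_terms = sum(
        math.lgamma(x + ak) - math.lgamma(ak) for x, ak in zip(counts, alpha)
    )
    return log_coef + log_norm + log_terms
```

With two categories and `alpha = [1.0, 1.0]` the distribution reduces to a uniform beta-binomial, so every split of `n` trials has probability `1 / (n + 1)`, which gives a convenient sanity check.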

31. Contextual Bandit with Missing Rewards

Author: Bouneffouf, Djallel, Upadhyay, Sohini, and Khazaeni, Yasaman
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: We consider a novel variant of the contextual bandit problem (i.e., the multi-armed bandit with side-information, or context, available to a decision-maker) where the reward associated with each context-based decision may not always be observed ("missing rewards"). This new problem is motivated by certain online settings including clinical trial and ad recommendation applications. In order to address the missing rewards setting, we propose to combine the standard contextual bandit approach with an unsupervised learning mechanism such as clustering. Unlike standard contextual bandit methods, by leveraging clustering to estimate missing rewards, we are able to learn from each incoming event, even those with missing rewards. Promising empirical results are obtained on several real-life datasets.
Published: 2020

32. Online learning with Corrupted context: Corrupted Contextual Bandits

Author: Bouneffouf, Djallel
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We consider a novel variant of the contextual bandit problem (i.e., the multi-armed bandit with side-information, or context, available to a decision-maker) where the context used at each decision may be corrupted ("useless context"). This new problem is motivated by certain on-line settings including clinical trial and ad recommendation applications. In order to address the corrupted-context setting, we propose to combine the standard contextual bandit approach with a classical multi-armed bandit mechanism. Unlike standard contextual bandit methods, we are able to learn from every iteration, even those with corrupted context, by improving the computation of the expectation for each arm. Promising empirical results are obtained on several real-life datasets.
Published: 2020

33. Solving Constrained CASH Problems with ADMM

Author: Ram, Parikshit, Liu, Sijia, Vijaykeerthi, Deepak, Wang, Dakuo, Bouneffouf, Djallel, Bramble, Greg, Samulowitz, Horst, and Gray, Alexander G.
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: The CASH problem has been widely studied in the context of automated configurations of machine learning (ML) pipelines and various solvers and toolkits are available. However, CASH solvers do not directly handle black-box constraints such as fairness, robustness or other domain-specific custom constraints. We present our recent approach [Liu, et al., 2020] that leverages the ADMM optimization framework to decompose CASH into multiple small problems and demonstrate how ADMM facilitates incorporation of black-box constraints., Comment: 7th ICML Workshop on Automated Machine Learning (2020)
Published: 2020

34. Online Learning in Iterated Prisoner's Dilemma to Mimic Human Behavior

Author: Lin, Baihan, Bouneffouf, Djallel, and Cecchi, Guillermo
Subjects: Computer Science - Computer Science and Game Theory, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Multiagent Systems, Quantitative Biology - Neurons and Cognition
Abstract: As an important psychological and social experiment, the Iterated Prisoner's Dilemma (IPD) treats the choice to cooperate or defect as an atomic action. We propose to study the behaviors of online learning algorithms in the Iterated Prisoner's Dilemma (IPD) game, where we investigate the full spectrum of reinforcement learning agents: multi-armed bandits, contextual bandits and reinforcement learning. We evaluate them based on a tournament of iterated prisoner's dilemma where multiple agents can compete in a sequential fashion. This allows us to analyze the dynamics of policies learned by multiple self-interested independent reward-driven agents, and also allows us to study the capacity of these algorithms to fit human behaviors. Results suggest that considering the current situation to make decisions is the worst in this kind of social dilemma game. Multiple discoveries on online learning behaviors and clinical validations are stated, as an effort to connect artificial intelligence algorithms with human behaviors and their abnormal states in neuropsychiatric conditions., Comment: Proceeding of PRICAI 2022. To the best of our knowledge, this is the first attempt to explore the full spectrum of reinforcement learning agents (multi-armed bandits, contextual bandits and reinforcement learning) in the sequential social dilemma. Codes at https://github.com/doerlbh/dilemmaRL
Published: 2020

35. Unified Models of Human Behavioral Agents in Bandits, Contextual Bandits and RL

Author: Lin, Baihan, Cecchi, Guillermo, Bouneffouf, Djallel, Reinen, Jenna, and Rish, Irina
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Quantitative Biology - Neurons and Cognition, Statistics - Machine Learning
Abstract: Artificial behavioral agents are often evaluated based on their consistent behaviors and performance to take sequential actions in an environment to maximize some notion of cumulative reward. However, human decision making in real life usually involves different strategies and behavioral trajectories that lead to the same empirical outcome. Motivated by clinical literature of a wide range of neurological and psychiatric disorders, we propose here a more general and flexible parametric framework for sequential decision making that involves a two-stream reward processing mechanism. We demonstrated that this framework is flexible and unified enough to incorporate a family of problems spanning multi-armed bandits (MAB), contextual bandits (CB) and reinforcement learning (RL), which decompose the sequential decision making process in different levels. Inspired by the known reward processing abnormalities of many mental disorders, our clinically-inspired agents demonstrated interesting behavioral trajectories and comparable performance on simulated tasks with particular reward distributions, a real-world dataset capturing human decision-making in gambling tasks, and the PacMan game across different reward stationarities in a lifelong learning setting., Comment: Proceeding of HBAI 2020. This article supersedes and extends our work arXiv:1706.02897 (MAB) and arXiv:1906.11286 (RL) into the Contextual Bandit (CB) framework. It generalized extensively into multi-armed bandits, contextual bandits and RL settings to create a unified framework of human behavioral agents
Published: 2020

36. Hyper-parameter Tuning for the Contextual Bandit

Author: Bouneffouf, Djallel and Claeys, Emmanuelle
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We study here the problem of learning the exploration-exploitation trade-off in the contextual bandit problem with a linear reward function. In the traditional algorithms that solve the contextual bandit problem, the exploration is a parameter that is tuned by the user. In contrast, our proposed algorithm learns to choose the right exploration parameters in an online manner based on the observed context and the immediate reward received for the chosen action. We present here two algorithms that use a bandit to find the optimal exploration of the contextual bandit algorithm, which we hope is the first step toward the automation of the multi-armed bandit algorithm., Comment: arXiv admin note: text overlap with arXiv:1705.03821
Published: 2020

37. Neural Topic Modeling of Psychotherapy Sessions

Author: Lin, Baihan, Bouneffouf, Djallel, Cecchi, Guillermo, Tejwani, Ravi, Kacprzyk, Janusz, Series Editor, Shaban-Nejad, Arash, editor, Michalowski, Martin, editor, and Bianco, Simone, editor
Published: 2023
Full Text: View/download PDF

38. Deep Annotation of Therapeutic Working Alliance in Psychotherapy

Author: Lin, Baihan, Cecchi, Guillermo, Bouneffouf, Djallel, Kacprzyk, Janusz, Series Editor, Shaban-Nejad, Arash, editor, Michalowski, Martin, editor, and Bianco, Simone, editor
Published: 2023
Full Text: View/download PDF

39. How can AI Automate End-to-End Data Science?

Author: Aggarwal, Charu, Bouneffouf, Djallel, Samulowitz, Horst, Buesser, Beat, Hoang, Thanh, Khurana, Udayan, Liu, Sijia, Pedapati, Tejaswini, Ram, Parikshit, Rawat, Ambrish, Wistuba, Martin, and Gray, Alexander
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Data science is labor-intensive and human experts are scarce but heavily involved in every aspect of it. This makes data science time consuming and restricted to experts, with the resulting quality heavily dependent on their experience and skills. To make data science more accessible and scalable, we need its democratization. Automated Data Science (AutoDS) is aimed towards that goal and is emerging as an important research and business topic. We introduce and define the AutoDS challenge, followed by a proposal of a general AutoDS framework that covers existing approaches but also provides guidance for the development of new methods. We categorize and review the existing literature from multiple aspects of the problem setup and employed techniques. Then we provide several views on how AI could succeed in automating end-to-end AutoDS. We hope this survey can serve as an insightful guideline for the AutoDS field and provide inspiration for future research.
Published: 2019

40. Split Q Learning: Reinforcement Learning with Two-Stream Rewards

Author: Lin, Baihan, Bouneffouf, Djallel, and Cecchi, Guillermo
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Multiagent Systems, Quantitative Biology - Neurons and Cognition, Statistics - Machine Learning
Abstract: Drawing inspiration from behavioral studies of human decision making, we propose here a general parametric framework for a reinforcement learning problem, which extends the standard Q-learning approach to incorporate a two-stream framework of reward processing with biases biologically associated with several neurological and psychiatric conditions, including Parkinson's and Alzheimer's diseases, attention-deficit/hyperactivity disorder (ADHD), addiction, and chronic pain. For the AI community, the development of agents that react differently to different types of rewards can enable us to understand a wide spectrum of multi-agent interactions in complex real-world socioeconomic systems. Moreover, from the behavioral modeling perspective, our parametric framework can be viewed as a first step towards a unifying computational model capturing reward processing abnormalities across multiple mental conditions and user preferences in long-term recommendation systems., Comment: IJCAI 2019. This article supersedes our work arXiv:1706.02897 into RL setting, with a different focus by applying Inverse Reinforcement Learning to model human clinical behavioral bias. It also precedes our work arXiv:1906.11286 which introduces extensive emphases in RL games
Published: 2019

41. A Story of Two Streams: Reinforcement Learning Models from Human Behavior and Neuropsychiatry

Author: Lin, Baihan, Cecchi, Guillermo, Bouneffouf, Djallel, Reinen, Jenna, and Rish, Irina
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Multiagent Systems, Quantitative Biology - Neurons and Cognition, Statistics - Machine Learning
Abstract: Drawing an inspiration from behavioral studies of human decision making, we propose here a more general and flexible parametric framework for reinforcement learning that extends standard Q-learning to a two-stream model for processing positive and negative rewards, and allows to incorporate a wide range of reward-processing biases -- an important component of human decision making which can help us better understand a wide spectrum of multi-agent interactions in complex real-world socioeconomic systems, as well as various neuropsychiatric conditions associated with disruptions in normal reward processing. From the computational perspective, we observe that the proposed Split-QL model and its clinically inspired variants consistently outperform standard Q-Learning and SARSA methods, as well as recently proposed Double Q-Learning approaches, on simulated tasks with particular reward distributions, a real-world dataset capturing human decision-making in gambling tasks, and the Pac-Man game in a lifelong learning setting across different reward stationarities., Comment: Published in AAMAS 2020 as a full paper. This article supersedes our work arXiv:1706.02897 into RL setting and extends extensively into RL games, cognitive modeling, and gambling tasks in lifelong learning setting
Published: 2019
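
The idea of processing positive and negative rewards in two separate streams, as described in the abstract above, can be sketched with a tabular agent that keeps one Q-table per stream. This is a generic illustration under stated assumptions, not the Split-QL algorithm or parameterization from the paper: the class name, the weights `w_pos` and `w_neg`, and the rule of acting greedily on the sum of the two tables are all hypothetical choices made for the example.

```python
from collections import defaultdict


class TwoStreamQ:
    """Tabular Q-learning with separate tables for positive and negative rewards."""

    def __init__(self, n_actions, lr=0.1, gamma=0.9, w_pos=1.0, w_neg=1.0):
        self.n_actions = n_actions
        self.lr = lr          # learning rate
        self.gamma = gamma    # discount factor
        self.w_pos = w_pos    # hypothetical weight on positive rewards
        self.w_neg = w_neg    # hypothetical weight on negative rewards
        self.q_pos = defaultdict(lambda: [0.0] * n_actions)
        self.q_neg = defaultdict(lambda: [0.0] * n_actions)

    def combined(self, state):
        """Action values used for decisions: the sum of both streams."""
        return [p + n for p, n in zip(self.q_pos[state], self.q_neg[state])]

    def greedy_action(self, state):
        values = self.combined(state)
        return max(range(self.n_actions), key=values.__getitem__)

    def update(self, state, action, reward, next_state):
        # Split the scalar reward into its positive and negative parts.
        r_pos = self.w_pos * max(reward, 0.0)
        r_neg = self.w_neg * min(reward, 0.0)
        # Both streams bootstrap from the same greedy next action.
        a_next = self.greedy_action(next_state)
        for table, r in ((self.q_pos, r_pos), (self.q_neg, r_neg)):
            target = r + self.gamma * table[next_state][a_next]
            table[state][action] += self.lr * (target - table[state][action])
```

Setting `w_neg` below 1 makes the sketch agent less sensitive to losses, and setting `w_pos` below 1 makes it less sensitive to gains, which is one simple way to express a reward-processing bias in this toy setup.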

42. Optimal Exploitation of Clustering and History Information in Multi-Armed Bandit

Author: Bouneffouf, Djallel, Parthasarathy, Srinivasan, Samulowitz, Horst, and Wistub, Martin
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We consider the stochastic multi-armed bandit problem and the contextual bandit problem with historical observations and pre-clustered arms. The historical observations can contain any number of instances for each arm, and the pre-clustering information is a fixed clustering of arms provided as part of the input. We develop a variety of algorithms which incorporate this offline information effectively during the online exploration phase and derive their regret bounds. In particular, we develop the META algorithm which effectively hedges between two other algorithms: one which uses both historical observations and clustering, and another which uses only the historical observations. The former outperforms the latter when the clustering quality is good, and vice-versa. Extensive experiments on synthetic and real world datasets on Warfarin drug dosage and web server selection for latency minimization validate our theoretical insights and demonstrate that META is a robust strategy for optimally exploiting the pre-clustering information., Comment: IJCAI 2019, International Joint Conferences on Artificial Intelligence
Published: 2019

43. An ADMM Based Framework for AutoML Pipeline Configuration

Author: Liu, Sijia, Ram, Parikshit, Vijaykeerthy, Deepak, Bouneffouf, Djallel, Bramble, Gregory, Samulowitz, Horst, Wang, Dakuo, Conn, Andrew, and Gray, Alexander
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We study the AutoML problem of automatically configuring machine learning pipelines by jointly selecting algorithms and their appropriate hyper-parameters for all steps in supervised learning pipelines. This black-box (gradient-free) optimization with mixed integer & continuous variables is a challenging problem. We propose a novel AutoML scheme by leveraging the alternating direction method of multipliers (ADMM). The proposed framework is able to (i) decompose the optimization problem into easier sub-problems that have a reduced number of variables and circumvent the challenge of mixed variable categories, and (ii) incorporate black-box constraints along-side the black-box optimization objective. We empirically evaluate the flexibility (in utilizing existing AutoML techniques), effectiveness (against open source AutoML toolkits), and unique capability (of executing AutoML with practically motivated black-box constraints) of our proposed scheme on a collection of binary classification data sets from UCI ML & OpenML repositories. We observe that on an average our framework provides significant gains in comparison to other AutoML frameworks (Auto-sklearn & TPOT), highlighting the practical advantages of this framework.
Published: 2019

44. A Survey on Practical Applications of Multi-Armed and Contextual Bandits

Author: Bouneffouf, Djallel and Rish, Irina
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: In recent years, the multi-armed bandit (MAB) framework has attracted a lot of attention in various applications, from recommender systems and information retrieval to healthcare and finance, due to its stellar performance combined with certain attractive properties, such as learning from less feedback. The multi-armed bandit field is currently flourishing, as novel problem settings and algorithms motivated by various practical applications are being introduced, building on top of the classical bandit problem. This article aims to provide a comprehensive review of top recent developments in multiple real-life applications of the multi-armed bandit. Specifically, we introduce a taxonomy of common MAB-based applications and summarize the state of the art for each of those domains. Furthermore, we identify important current trends and provide new perspectives pertaining to the future of this exciting and fast-growing field., Comment: under review by IJCAI 2019 Survey
Published: 2019

45. Online Learning in Iterated Prisoner’s Dilemma to Mimic Human Behavior

Author: Lin, Baihan, Bouneffouf, Djallel, Cecchi, Guillermo, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Khanna, Sankalp, editor, Cao, Jian, editor, Bai, Quan, editor, and Xu, Guandong, editor
Published: 2022
Full Text: View/download PDF

46. Interpretable Multi-Objective Reinforcement Learning through Policy Orchestration

Author: Noothigattu, Ritesh, Bouneffouf, Djallel, Mattei, Nicholas, Chandra, Rachita, Madan, Piyush, Varshney, Kush, Campbell, Murray, Singh, Moninder, and Rossi, Francesca
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: Autonomous cyber-physical agents and systems play an increasingly large role in our lives. To ensure that agents behave in ways aligned with the values of the societies in which they operate, we must develop techniques that allow these agents to not only maximize their reward in an environment, but also to learn and follow the implicit constraints of society. These constraints and norms can come from any number of sources including regulations, business process guidelines, laws, ethical principles, social norms, and moral values. We detail a novel approach that uses inverse reinforcement learning to learn a set of unspecified constraints from demonstrations of the task, and reinforcement learning to learn to maximize the environment rewards. More precisely, we assume that an agent can observe traces of behavior of members of the society but has no access to the explicit set of constraints that give rise to the observed behavior. Inverse reinforcement learning is used to learn such constraints, that are then combined with a possibly orthogonal value function through the use of a contextual bandit-based orchestrator that picks a contextually-appropriate choice between the two policies (constraint-based and environment reward-based) when taking actions. The contextual bandit orchestrator allows the agent to mix policies in novel ways, taking the best actions from either a reward maximizing or constrained policy. In addition, the orchestrator is transparent on which policy is being employed at each time step. We test our algorithms using a Pac-Man domain and show that the agent is able to learn to act optimally, act within the demonstrated constraints, and mix these two functions in complex ways., Comment: 8 pages, 3 figures
Published: 2018

47. Incorporating Behavioral Constraints in Online AI Systems

Author: Balakrishnan, Avinash, Bouneffouf, Djallel, Mattei, Nicholas, and Rossi, Francesca
Subjects: Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction, Computer Science - Machine Learning
Abstract: AI systems that learn through reward feedback about the actions they take are increasingly deployed in domains that have significant impact on our daily life. However, in many cases the online rewards should not be the only guiding criteria, as there are additional constraints and/or priorities imposed by regulations, values, preferences, or ethical principles. We detail a novel online agent that learns a set of behavioral constraints by observation and uses these learned constraints as a guide when making decisions in an online setting while still being reactive to reward feedback. To define this agent, we propose to adopt a novel extension to the classical contextual multi-armed bandit setting and we provide a new algorithm called Behavior Constrained Thompson Sampling (BCTS) that allows for online learning while obeying exogenous constraints. Our agent learns a constrained policy that implements the observed behavioral constraints demonstrated by a teacher agent, and then uses this constrained policy to guide the reward-based online exploration and exploitation. We characterize the upper bound on the expected regret of the contextual bandit algorithm that underlies our agent and provide a case study with real world data in two application domains. Our experiments show that the designed agent is able to act within the set of behavior constraints without significantly degrading its overall reward performance., Comment: 9 pages, 6 figures
Published: 2018

48. Beyond Backprop: Online Alternating Minimization with Auxiliary Variables

Author: Choromanska, Anna, Cowen, Benjamin, Kumaravel, Sadhana, Luss, Ronny, Rigotti, Mattia, Rish, Irina, Kingsbury, Brian, DiAchille, Paolo, Gurev, Viatcheslav, Tejwani, Ravi, and Bouneffouf, Djallel
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: Despite significant recent advances in deep neural networks, training them remains a challenge due to the highly non-convex nature of the objective function. State-of-the-art methods rely on error backpropagation, which suffers from several well-known issues, such as vanishing and exploding gradients, inability to handle non-differentiable nonlinearities and to parallelize weight-updates across layers, and biological implausibility. These limitations continue to motivate exploration of alternative training algorithms, including several recently proposed auxiliary-variable methods which break the complex nested objective function into local subproblems. However, those techniques are mainly offline (batch), which limits their applicability to extremely large datasets, as well as to online, continual or reinforcement learning. The main contribution of our work is a novel online (stochastic/mini-batch) alternating minimization (AM) approach for training deep neural networks, together with the first theoretical convergence guarantees for AM in stochastic settings and promising empirical results on a variety of architectures and datasets., Comment: First six authors contributed equally to this work: A.C. - theory, manuscript, B.C. - code, experiments, S.K. - code, experiments, R.L. - algorithm, experiments, M.R. - code, experiments, I.R. - algorithm, manuscript
Published: 2018

49. Reinforcement learning with algorithms from probabilistic structure estimation

Author: Epperlein, Jonathan P., Overko, Roman, Zhuk, Sergiy, King, Christopher, Bouneffouf, Djallel, Cullen, Andrew, and Shorten, Robert
Published: 2022

50. Contextual Bandit with Adaptive Feature Extraction

Author: Lin, Baihan, Bouneffouf, Djallel, Cecchi, Guillermo, and Rish, Irina
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We consider an online decision making setting known as the contextual bandit problem, and propose an approach for improving contextual bandit performance by using adaptive feature extraction (representation learning) based on online clustering. Our approach starts with off-line pre-training on an unlabeled history of contexts (which can be exploited by our approach, but not by the standard contextual bandit), followed by online selection and adaptation of encoders. Specifically, given an input sample (context), the proposed approach selects the most appropriate encoding function to extract a feature vector, which becomes an input for a contextual bandit, and updates both the bandit and the encoding function based on the context and on the feedback (reward). Our experiments on a variety of datasets, in both stationary and non-stationary environments of several kinds, demonstrate clear advantages of the proposed adaptive representation learning over the standard contextual bandit based on "raw" input contexts., Comment: IEEE ICDMW 2018
Published: 2018
