Author: "Sontakke, Sumedh A" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Sontakke, Sumedh A"' showing total 17 results

Start Over Author "Sontakke, Sumedh A"

17 results on '"Sontakke, Sumedh A"'

1. Value Explicit Pretraining for Learning Transferable Representations

Author: Lekkala, Kiran, Bao, Henghui, Sontakke, Sumedh, and Itti, Laurent
Subjects: Computer Science - Machine Learning, Computer Science - Robotics
Abstract: We propose Value Explicit Pretraining (VEP), a method that learns generalizable representations for transfer reinforcement learning. VEP enables learning of new tasks that share similar objectives as previously learned tasks, by learning an encoder for objective-conditioned representations, irrespective of appearance changes and environment dynamics. To pre-train the encoder from a sequence of observations, we use a self-supervised contrastive loss that results in learning temporally smooth representations. VEP learns to relate states across different tasks based on the Bellman return estimate that is reflective of task progress. Experiments using a realistic navigation simulator and Atari benchmark show that the pretrained encoder produced by our method outperforms current SoTA pretraining methods on the ability to generalize to unseen tasks. VEP achieves up to a 2 times improvement in rewards on Atari and visual navigation, and up to a 3 times improvement in sample efficiency. For videos of policy performance visit our https://sites.google.com/view/value-explicit-pretraining/, Comment: Accepted at CoRL 2023 Workshop on PRL, Under Review at ICML 2024
Published: 2023

2. RoboCLIP: One Demonstration is Enough to Learn Robot Policies

Author: Sontakke, Sumedh A, Zhang, Jesse, Arnold, Sébastien M. R., Pertsch, Karl, Bıyık, Erdem, Sadigh, Dorsa, Finn, Chelsea, and Itti, Laurent
Subjects: Computer Science - Artificial Intelligence, Computer Science - Robotics
Abstract: Reward specification is a notoriously difficult problem in reinforcement learning, requiring extensive expert supervision to design robust reward functions. Imitation learning (IL) methods attempt to circumvent these problems by utilizing expert demonstrations but typically require a large number of in-domain expert demonstrations. Inspired by advances in the field of Video-and-Language Models (VLMs), we present RoboCLIP, an online imitation learning method that uses a single demonstration (overcoming the large data requirement) in the form of a video demonstration or a textual description of the task to generate rewards without manual reward function design. Additionally, RoboCLIP can also utilize out-of-domain demonstrations, like videos of humans solving the task for reward generation, circumventing the need to have the same demonstration and deployment domains. RoboCLIP utilizes pretrained VLMs without any finetuning for reward generation. Reinforcement learning agents trained with RoboCLIP rewards demonstrate 2-3 times higher zero-shot performance than competing imitation learning methods on downstream robot manipulation tasks, doing so using only one video/text demonstration.
Published: 2023

3. Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions

Author: Chebotar, Yevgen, Vuong, Quan, Irpan, Alex, Hausman, Karol, Xia, Fei, Lu, Yao, Kumar, Aviral, Yu, Tianhe, Herzog, Alexander, Pertsch, Karl, Gopalakrishnan, Keerthana, Ibarz, Julian, Nachum, Ofir, Sontakke, Sumedh, Salazar, Grecia, Tran, Huong T, Peralta, Jodilyn, Tan, Clayton, Manjunath, Deeksha, Singht, Jaspiar, Zitkovich, Brianna, Jackson, Tomas, Rao, Kanishka, Finn, Chelsea, and Levine, Sergey
Subjects: Computer Science - Robotics, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: In this work, we present a scalable reinforcement learning method for training multi-task policies from large offline datasets that can leverage both human demonstrations and autonomously collected data. Our method uses a Transformer to provide a scalable representation for Q-functions trained via offline temporal difference backups. We therefore refer to the method as Q-Transformer. By discretizing each action dimension and representing the Q-value of each action dimension as separate tokens, we can apply effective high-capacity sequence modeling techniques for Q-learning. We present several design decisions that enable good performance with offline RL training, and show that Q-Transformer outperforms prior offline RL algorithms and imitation learning techniques on a large diverse real-world robotic manipulation task suite. The project's website and videos can be found at https://qtransformer.github.io, Comment: See website at https://qtransformer.github.io
Published: 2023

4. Lightweight Learner for Shared Knowledge Lifelong Learning

Author: Ge, Yunhao, Li, Yuecheng, Wu, Di, Xu, Ao, Jones, Adam M., Rios, Amanda Sofie, Fostiropoulos, Iordanis, Wen, Shixian, Huang, Po-Hsuan, Murdock, Zachary William, Sahin, Gozde, Ni, Shuo, Lekkala, Kiran, Sontakke, Sumedh Anand, and Itti, Laurent
Subjects: Computer Science - Machine Learning
Abstract: In Lifelong Learning (LL), agents continually learn as they encounter new conditions and tasks. Most current LL is limited to a single agent that learns tasks sequentially. Dedicated LL machinery is then deployed to mitigate the forgetting of old tasks as new tasks are learned. This is inherently slow. We propose a new Shared Knowledge Lifelong Learning (SKILL) challenge, which deploys a decentralized population of LL agents that each sequentially learn different tasks, with all agents operating independently and in parallel. After learning their respective tasks, agents share and consolidate their knowledge over a decentralized communication network, so that, in the end, all agents can master all tasks. We present one solution to SKILL which uses Lightweight Lifelong Learning (LLL) agents, where the goal is to facilitate efficient sharing by minimizing the fraction of the agent that is specialized for any given task. Each LLL agent thus consists of a common task-agnostic immutable part, where most parameters are, and individual task-specific modules that contain fewer parameters but are adapted to each task. Agents share their task-specific modules, plus summary information ("task anchors") representing their tasks in the common task-agnostic latent space of all agents. Receiving agents register each received task-specific module using the corresponding anchor. Thus, every agent improves its ability to solve new tasks each time new task-specific modules and anchors are received. On a new, very challenging SKILL-102 dataset with 102 image classification tasks (5,033 classes in total, 2,041,225 training, 243,464 validation, and 243,464 test images), we achieve much higher (and SOTA) accuracy over 8 LL baselines, while also achieving near perfect parallelization. Code and data can be found at https://github.com/gyhandy/Shared-Knowledge-Lifelong-Learning, Comment: Transactions on Machine Learning Research (TMLR) paper
Published: 2023

5. RT-1: Robotics Transformer for Real-World Control at Scale

Author: Brohan, Anthony, Brown, Noah, Carbajal, Justice, Chebotar, Yevgen, Dabis, Joseph, Finn, Chelsea, Gopalakrishnan, Keerthana, Hausman, Karol, Herzog, Alex, Hsu, Jasmine, Ibarz, Julian, Ichter, Brian, Irpan, Alex, Jackson, Tomas, Jesmonth, Sally, Joshi, Nikhil J, Julian, Ryan, Kalashnikov, Dmitry, Kuang, Yuheng, Leal, Isabel, Lee, Kuang-Huei, Levine, Sergey, Lu, Yao, Malla, Utsav, Manjunath, Deeksha, Mordatch, Igor, Nachum, Ofir, Parada, Carolina, Peralta, Jodilyn, Perez, Emily, Pertsch, Karl, Quiambao, Jornell, Rao, Kanishka, Ryoo, Michael, Salazar, Grecia, Sanketi, Pannag, Sayed, Kevin, Singh, Jaspiar, Sontakke, Sumedh, Stone, Austin, Tan, Clayton, Tran, Huong, Vanhoucke, Vincent, Vega, Steve, Vuong, Quan, Xia, Fei, Xiao, Ted, Xu, Peng, Xu, Sichun, Yu, Tianhe, and Zitkovich, Brianna
Subjects: Computer Science - Robotics, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: By transferring knowledge from large, diverse, task-agnostic datasets, modern machine learning models can solve specific downstream tasks either zero-shot or with small task-specific datasets to a high level of performance. While this capability has been demonstrated in other fields such as computer vision, natural language processing or speech recognition, it remains to be shown in robotics, where the generalization capabilities of the models are particularly critical due to the difficulty of collecting real-world robotic data. We argue that one of the keys to the success of such general robotic models lies with open-ended task-agnostic training, combined with high-capacity architectures that can absorb all of the diverse, robotic data. In this paper, we present a model class, dubbed Robotics Transformer, that exhibits promising scalable model properties. We verify our conclusions in a study of different model classes and their ability to generalize as a function of the data size, model size, and data diversity based on a large-scale data collection on real robots performing real-world tasks. The project's website and videos can be found at robotics-transformer1.github.io, Comment: See website at robotics-transformer1.github.io
Published: 2022

6. Model2Detector: Widening the Information Bottleneck for Out-of-Distribution Detection using a Handful of Gradient Steps

Author: Sontakke, Sumedh A, Ramanan, Buvaneswari, Itti, Laurent, and Woo, Thomas
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Out-of-distribution detection is an important capability that has long eluded vanilla neural networks. Deep Neural networks (DNNs) tend to generate over-confident predictions when presented with inputs that are significantly out-of-distribution (OOD). This can be dangerous when employing machine learning systems in the wild as detecting attacks can thus be difficult. Recent advances inference-time out-of-distribution detection help mitigate some of these problems. However, existing methods can be restrictive as they are often computationally expensive. Additionally, these methods require training of a downstream detector model which learns to detect OOD inputs from in-distribution ones. This, therefore, adds latency during inference. Here, we offer an information theoretic perspective on why neural networks are inherently incapable of OOD detection. We attempt to mitigate these flaws by converting a trained model into a an OOD detector using a handful of steps of gradient descent. Our work can be employed as a post-processing method whereby an inference-time ML system can convert a trained model into an OOD detector. Experimentally, we show how our method consistently outperforms the state-of-the-art in detection accuracy on popular image datasets while also reducing computational complexity., Comment: arXiv admin note: text overlap with arXiv:1807.03888, arXiv:1812.04606 by other authors
Published: 2022

7. GalilAI: Out-of-Task Distribution Detection using Causal Active Experimentation for Safe Transfer RL

Author: Sontakke, Sumedh A, Iota, Stephen, Hu, Zizhao, Mehrjou, Arash, Itti, Laurent, and Schölkopf, Bernhard
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Out-of-distribution (OOD) detection is a well-studied topic in supervised learning. Extending the successes in supervised learning methods to the reinforcement learning (RL) setting, however, is difficult due to the data generating process - RL agents actively query their environment for data, and the data are a function of the policy followed by the agent. An agent could thus neglect a shift in the environment if its policy did not lead it to explore the aspect of the environment that shifted. Therefore, to achieve safe and robust generalization in RL, there exists an unmet need for OOD detection through active experimentation. Here, we attempt to bridge this lacuna by first defining a causal framework for OOD scenarios or environments encountered by RL agents in the wild. Then, we propose a novel task: that of Out-of-Task Distribution (OOTD) detection. We introduce an RL agent that actively experiments in a test environment and subsequently concludes whether it is OOTD or not. We name our method GalilAI, in honor of Galileo Galilei, as it discovers, among other causal processes, that gravitational acceleration is independent of the mass of a body. Finally, we propose a simple probabilistic neural network baseline for comparison, which extends extant Model-Based RL. We find that GalilAI outperforms the baseline significantly. See visualizations of our method https://galil-ai.github.io/
Published: 2021

8. Video2Skill: Adapting Events in Demonstration Videos to Skills in an Environment using Cyclic MDP Homomorphisms

Author: Sontakke, Sumedh A, Roychowdhury, Sumegh, Sarkar, Mausoom, Puri, Nikaash, Krishnamurthy, Balaji, and Itti, Laurent
Subjects: Computer Science - Artificial Intelligence
Abstract: Humans excel at learning long-horizon tasks from demonstrations augmented with textual commentary, as evidenced by the burgeoning popularity of tutorial videos online. Intuitively, this capability can be separated into 2 distinct subtasks - first, dividing a long-horizon demonstration sequence into semantically meaningful events; second, adapting such events into meaningful behaviors in one's own environment. Here, we present Video2Skill (V2S), which attempts to extend this capability to artificial agents by allowing a robot arm to learn from human cooking videos. We first use sequence-to-sequence Auto-Encoder style architectures to learn a temporal latent space for events in long-horizon demonstrations. We then transfer these representations to the robotic target domain, using a small amount of offline and unrelated interaction data (sequences of state-action pairs of the robot arm controlled by an expert) to adapt these events into actionable representations, i.e., skills. Through experiments, we demonstrate that our approach results in self-supervised analogy learning, where the agent learns to draw analogies between motions in human demonstration data and behaviors in the robotic environment. We also demonstrate the efficacy of our approach on model learning - demonstrating how Video2Skill utilizes prior knowledge from human demonstration to outperform traditional model learning of long-horizon dynamics. Finally, we demonstrate the utility of our approach for non-tabula rasa decision-making, i.e, utilizing video demonstration for zero-shot skill generation.
Published: 2021

9. SHERLock: Self-Supervised Hierarchical Event Representation Learning

Author: Roychowdhury, Sumegh, Sontakke, Sumedh A., Puri, Nikaash, Sarkar, Mausoom, Aggarwal, Milan, Badjatiya, Pinkesh, Krishnamurthy, Balaji, and Itti, Laurent
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Temporal event representations are an essential aspect of learning among humans. They allow for succinct encoding of the experiences we have through a variety of sensory inputs. Also, they are believed to be arranged hierarchically, allowing for an efficient representation of complex long-horizon experiences. Additionally, these representations are acquired in a self-supervised manner. Analogously, here we propose a model that learns temporal representations from long-horizon visual demonstration data and associated textual descriptions, without explicit temporal supervision. Our method produces a hierarchy of representations that align more closely with ground-truth human-annotated events (+15.3) than state-of-the-art unsupervised baselines. Our results are comparable to heavily-supervised baselines in complex visual domains such as Chess Openings, YouCook2 and TutorialVQA datasets. Finally, we perform ablation studies illustrating the robustness of our approach. We release our code and demo visualizations in the Supplementary Material., Comment: Accepted at ICPR '22
Published: 2020

10. Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning

Author: Sontakke, Sumedh A., Mehrjou, Arash, Itti, Laurent, and Schölkopf, Bernhard
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Robotics
Abstract: Animals exhibit an innate ability to learn regularities of the world through interaction. By performing experiments in their environment, they are able to discern the causal factors of variation and infer how they affect the world's dynamics. Inspired by this, we attempt to equip reinforcement learning agents with the ability to perform experiments that facilitate a categorization of the rolled-out trajectories, and to subsequently infer the causal factors of the environment in a hierarchical manner. We introduce {\em causal curiosity}, a novel intrinsic reward, and show that it allows our agents to learn optimal sequences of actions and discover causal factors in the dynamics of the environment. The learned behavior allows the agents to infer a binary quantized representation for the ground-truth causal factors in every environment. Additionally, we find that these experimental behaviors are semantically meaningful (e.g., our agents learn to lift blocks to categorize them by weight), and are learnt in a self-supervised manner with approximately 2.5 times less data than conventional supervised planners. We show that these behaviors can be re-purposed and fine-tuned (e.g., from lifting to pushing or other downstream tasks). Finally, we show that the knowledge of causal factor representations aids zero-shot learning for more complex tasks. Visit https://sites.google.com/usc.edu/causal-curiosity/home for website., Comment: International Conference on Machine Learning, PMLR 139, 2021
Published: 2020

11. Classification of Cardiotocography Signals Using Machine Learning

Author: Sontakke, Sumedh Anand, Lohokare, Jay, Dani, Reshul, Shivagaje, Pranav, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Arai, Kohei, editor, Kapoor, Supriya, editor, and Bhatia, Rahul, editor
Published: 2019
Full Text: View/download PDF

12. Classification of Cardiotocography Signals Using Machine Learning

Author: Sontakke, Sumedh Anand, primary, Lohokare, Jay, additional, Dani, Reshul, additional, and Shivagaje, Pranav, additional
Published: 2018
Full Text: View/download PDF

13. Acquiring Domain Knowledge for Cardiotocography: A Deep Learning Approach

Author: Huddar, Priyamvada Pushkar, primary and Sontakke, Sumedh Anand, additional
Published: 2019
Full Text: View/download PDF

14. Emergency services platform for smart cities

Author: Lohokare, Jay, primary, Dani, Reshul, additional, Sontakke, Sumedh, additional, Apte, Ameya, additional, and Sahni, Rishabh, additional
Published: 2017
Full Text: View/download PDF

15. Scalable tracking system for public buses using IoT technologies

Author: Lohokare, Jay, primary, Dani, Reshul, additional, Sontakke, Sumedh, additional, and Adhao, Rahul, additional
Published: 2017
Full Text: View/download PDF

16. Diagnosis of liver diseases using machine learning

Author: Sontakke, Sumedh, primary, Lohokare, Jay, additional, and Dani, Reshul, additional
Published: 2017
Full Text: View/download PDF

17. Automated data collection for credit score calculation based on financial transactions and social media

Author: Lohokare, Jay, primary, Dani, Reshul, additional, and Sontakke, Sumedh, additional
Published: 2017
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

17 results on '"Sontakke, Sumedh A"'

1. Value Explicit Pretraining for Learning Transferable Representations

2. RoboCLIP: One Demonstration is Enough to Learn Robot Policies

3. Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions

4. Lightweight Learner for Shared Knowledge Lifelong Learning

5. RT-1: Robotics Transformer for Real-World Control at Scale

6. Model2Detector: Widening the Information Bottleneck for Out-of-Distribution Detection using a Handful of Gradient Steps

7. GalilAI: Out-of-Task Distribution Detection using Causal Active Experimentation for Safe Transfer RL

8. Video2Skill: Adapting Events in Demonstration Videos to Skills in an Environment using Cyclic MDP Homomorphisms

9. SHERLock: Self-Supervised Hierarchical Event Representation Learning

10. Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning

11. Classification of Cardiotocography Signals Using Machine Learning

12. Classification of Cardiotocography Signals Using Machine Learning

13. Acquiring Domain Knowledge for Cardiotocography: A Deep Learning Approach

14. Emergency services platform for smart cities

15. Scalable tracking system for public buses using IoT technologies

16. Diagnosis of liver diseases using machine learning

17. Automated data collection for credit score calculation based on financial transactions and social media

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Database

Publisher

17 results on '"Sontakke, Sumedh A"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources