8 results for "Ramamoorthy, Ram"
Search Results
2. Semi-supervised Learning From Demonstration through Program Synthesis: An Inspection Robot Case Study
- Author
- Smith, Simón C., Ramamoorthy, Ram, Cardoso, Rafael C., Ferrando, Angelo, Briola, Daniela, Menghi, Claudio, and Ahlbrecht, Tobias
- Abstract
Semi-supervised learning improves the performance of supervised machine learning by leveraging methods from unsupervised learning to extract information not explicitly available in the labels. Through the design of a system that enables a robot to learn inspection strategies from a human operator, we present a hybrid semi-supervised system capable of learning interpretable and verifiable models from demonstrations. The system induces a controller program by learning from immersive demonstrations using sequential importance sampling. These visual servo controllers are parametrised by proportional gains and are visually verifiable through observation of the robot’s position in the environment. Clustering and effective particle size filtering allow the system to discover goals in the state space. These goals are used to label the original demonstration for end-to-end learning of behavioural models. The behavioural models are used for autonomous model predictive control and scrutinised for explanations. We implement causal sensitivity analysis to identify salient objects and generate counterfactual conditional explanations. These features enable decision-making interpretation and post hoc discovery of the causes of a failure. The proposed system expands on previous approaches to program synthesis by incorporating repellers in the attribution prior of the sampling process. We successfully learn the hybrid system from an inspection scenario where an unmanned ground vehicle has to inspect, in a specific order, different areas of the environment. The system induces an interpretable computer program of the demonstration that can be synthesised to produce novel inspection behaviours. Importantly, the robot successfully runs the synthesised program on an unseen configuration of the environment while presenting explanations of its autonomous behaviour.
- Published
- 2020
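The sampling machinery in the abstract above (sequential importance sampling with clustering and effective-sample-size style filtering over controller particles) can be illustrated with a minimal, hedged sketch. The weighting function, the resampling threshold, and the reading of particles as proportional gains are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2) for normalised weights."""
    w = weights / weights.sum()
    return 1.0 / np.sum(w ** 2)

def sis_step(particles, weights, log_likelihood, ess_threshold=0.5):
    """One sequential importance sampling step with ESS-based resampling.

    particles: (N, D) candidate controller parameters, read here as the
               proportional gains of a visual servo controller.
    log_likelihood: scores how well a particle explains the demonstration
                    (an illustrative stand-in for the real objective).
    """
    n = len(particles)
    weights = weights * np.exp(log_likelihood(particles))
    weights /= weights.sum()
    if effective_sample_size(weights) < ess_threshold * n:
        # Resample: duplicate high-weight particles, reset weights.
        idx = rng.choice(n, size=n, p=weights)
        particles = particles[idx]
        weights = np.full(n, 1.0 / n)
    return particles, weights

# Toy usage: 2-D gains; the "demonstration" favours gains near (1.0, 0.5).
target = np.array([1.0, 0.5])
loglik = lambda p: -np.sum((p - target) ** 2, axis=1)
particles = rng.uniform(0.0, 2.0, size=(200, 2))
weights = np.full(200, 1.0 / 200)
for _ in range(10):
    particles, weights = sis_step(particles, weights, loglik)
print(particles.mean(axis=0))  # concentrates near the favoured gains
```

In the paper's setting the surviving particles would then be clustered to discover goals in the state space; that clustering step is omitted here.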
3. Autonomous Driving with Interpretable Goal Recognition and Monte Carlo Tree Search
- Author
- Brewitt, Cillian, Albrecht, Stefano V., Wilhelm, John, Gyevnar, Balint, Eiras, Francisco, Dobre, Mihai, and Ramamoorthy, Ram
- Abstract
The ability to predict the intentions and driving trajectories of other vehicles is a key problem for autonomous driving. We propose an integrated planning and prediction system which leverages the computational benefit of using a finite space of maneuvers, and extend the approach to planning and prediction of sequences of maneuvers via rational inverse planning to recognise the goals of other vehicles. Goal recognition informs a Monte Carlo Tree Search (MCTS) algorithm to plan optimal maneuvers for the ego vehicle. Our system constructs plans which are explainable by means of rationality. Evaluation in simulations of four urban driving scenarios demonstrates the system’s ability to robustly recognise the goals of other vehicles while generating near-optimal plans.
- Published
- 2020
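The goal-recognition component described in the abstract above rests on rational inverse planning: a candidate goal is more probable the closer the observed behaviour is to an optimal plan for that goal. A minimal sketch of that posterior computation follows; the cost model, the inverse-temperature parameter beta, and the example numbers are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def goal_posterior(costs_observed, costs_optimal, prior=None, beta=1.0):
    """Posterior over candidate goals via rational inverse planning.

    costs_observed[g]: cost of reaching goal g via the observed partial
                       trajectory plus an optimal completion.
    costs_optimal[g]:  cost of the optimal plan to goal g from the start.
    A near-rational driver keeps the two costs close, so small differences
    give high likelihood: P(traj | g) proportional to
    exp(-beta * (observed - optimal)).
    """
    costs_observed = np.asarray(costs_observed, dtype=float)
    costs_optimal = np.asarray(costs_optimal, dtype=float)
    prior = np.ones_like(costs_observed) if prior is None else np.asarray(prior)
    log_post = -beta * (costs_observed - costs_optimal) + np.log(prior)
    log_post -= log_post.max()  # numerical stability
    post = np.exp(log_post)
    return post / post.sum()

# Toy usage: three candidate goals (turn left, straight on, turn right).
print(goal_posterior(costs_observed=[12.0, 10.5, 18.0],
                     costs_optimal=[11.5, 10.4, 12.0]))
# The straight-on goal, whose observed behaviour is nearly optimal,
# receives most of the probability mass.
```

In the full system such a posterior would weight the goal hypotheses that the MCTS planner reasons over when choosing maneuvers for the ego vehicle.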
4. An Optimization-based Motion Planner for Safe Autonomous Driving
- Author
- Eiras, Francisco, Hawasly, Majd, Albrecht, Stefano V., and Ramamoorthy, Ram
- Abstract
Guaranteeing safety in motion planning is a crucial bottleneck on the path towards wider adoption of autonomous driving technology. A promising direction is to pose safety requirements as planning constraints in nonlinear optimization problems of motion synthesis. However, many implementations of this approach are hindered by uncertain convergence and local optimality of the solutions, affecting the planner’s overall robustness. In this paper, we propose a novel two-stage optimization framework: we first find the solution to a Mixed-Integer Linear Programming (MILP) approximation of the motion synthesis problem, which in turn initializes a second Nonlinear Programming (NLP) formulation. We show that initializing the NLP stage with the MILP solution leads to better convergence, lower costs, and outperforms a state-of-the-art Nonlinear Model Predictive Control baseline in both progress and comfort metrics.
- Published
- 2020
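The two-stage scheme in the abstract above, a coarse mixed-integer linear stage whose solution warm-starts a nonlinear stage, can be sketched on a toy problem. The sketch substitutes a plain LP for the MILP stage and uses SciPy's general-purpose solvers; the objective and constraints are illustrative assumptions, not the paper's motion-synthesis formulation.

```python
import numpy as np
from scipy.optimize import linprog, minimize

# Stage 1: coarse linear approximation (stand-in for the MILP stage).
# Minimise x + y subject to x + 2y >= 4, 3x + y >= 3, x >= 0, y >= 0.
lp = linprog(c=[1.0, 1.0],
             A_ub=[[-1.0, -2.0], [-3.0, -1.0]],
             b_ub=[-4.0, -3.0],
             bounds=[(0, None), (0, None)])
x0 = lp.x  # warm start for the nonlinear stage

# Stage 2: nonlinear refinement (stand-in for the NLP stage).
# Same feasible region, but a smooth nonlinear cost.
def cost(z):
    x, y = z
    return (x - 1.0) ** 2 + (y - 2.0) ** 2 + 0.1 * np.sin(5 * x)

constraints = [{"type": "ineq", "fun": lambda z: z[0] + 2 * z[1] - 4},
               {"type": "ineq", "fun": lambda z: 3 * z[0] + z[1] - 3}]
nlp = minimize(cost, x0, bounds=[(0, None), (0, None)], constraints=constraints)

print("stage-1 warm start:", x0)
print("stage-2 solution:  ", nlp.x, "cost:", nlp.fun)
```

The point of the warm start is that the nonlinear solver begins from a feasible, sensible solution, which the abstract reports leads to better convergence and lower costs than solving the nonlinear problem alone.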
5. Diversity-Aware Recommendation for Human Collectives
- Author
- Andreadis, P., Ceppi, S., Rovatsos, M., and Ramamoorthy, Ram
- Abstract
Sharing economy applications need to coordinate humans, each of whom may have different preferences over the provided service. Traditional approaches model this as a resource allocation problem and solve it by identifying matches between users and resources. These require knowledge of user preferences and, crucially, assume that users act deterministically or, equivalently, that each of them is expected to accept the proposed match. This assumption is unrealistic for applications like ridesharing and house sharing (like Airbnb), where user coordination requires handling of the diversity and uncertainty in human behaviour. We address this shortcoming by proposing a diversity-aware recommender system that leaves the decision power to users but still assists them in coordinating their activities. We achieve this through taxation, which indirectly modifies users’ preferences over options by imposing a penalty on them. This is applied to options that, if selected, are expected to lead to less favourable outcomes from the perspective of the collective. The framework we use to identify the options to recommend is composed of three optimisation steps, each of which has a mixed-integer linear program at its core. Using a combination of these three programs, we are also able to compute solutions that permit a good trade-off between satisfying the global goals of the collective and the individual users’ interests. We demonstrate the effectiveness of our approach with two experiments in a simulated ridesharing scenario, showing: (a) significantly better coordination results with the approach we propose than with a set of recommendations in which taxation is not applied and each solution maximises the goal of the collective, (b) that we can propose a recommendation set to users instead of imposing a single allocation on them, at no loss to the collective, and (c) that our system allows for an adaptive trade-off between conflicting criteria.
- Published
- 2016
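A minimal sketch of the taxation idea in the abstract above: users keep the final choice, but a penalty is added to options that are expected to lead to less favourable collective outcomes, nudging rather than forcing decisions. The utility numbers and the linear tax rule are illustrative assumptions; the paper's framework is built from three mixed-integer linear programs.

```python
import numpy as np

def taxed_recommendations(user_utility, collective_cost, tax_rate=1.0):
    """Recommend one option per user after applying a tax.

    user_utility:    (n_users, n_options) individual preferences.
    collective_cost: (n_options,) how unfavourable each option is for the
                     group (e.g. extra vehicles needed in a ridesharing pool).
    Users still decide for themselves; here we model them as picking the
    option with the highest taxed utility.
    """
    taxed = user_utility - tax_rate * np.asarray(collective_cost)
    return taxed.argmax(axis=1)

# Toy usage: three users, two ridesharing options.
utility = np.array([[5.0, 4.5],
                    [4.0, 4.2],
                    [3.0, 2.9]])
collective_cost = np.array([1.0, 0.0])  # option 0 needs an extra car

print(taxed_recommendations(utility, collective_cost, tax_rate=0.0))  # [0 1 0]
print(taxed_recommendations(utility, collective_cost, tax_rate=1.0))  # [1 1 1]
```

With no tax the users split across options; with the tax their individually best choices coincide with the option that is better for the collective, which is the trade-off the recommender is tuning.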
6. Structured machine learning models for robustness against different factors of variability in robot control
- Author
- Davchev, Todor Bozhinov, Hospedales, Timothy, Ramamoorthy, Ram, and Munn, Tim
- Subjects
- Goal-conditioned Learning, Machine learning, Robot Learning, Learning from Demonstrations, Robotics, Structured Machine Learning, Reinforcement Learning
- Abstract
An important feature of human sensorimotor skill is our ability to learn to reuse it across different environmental contexts, in part due to our understanding of attributes of variability in these environments. This thesis explores how the structure of models used within learning for robot control could similarly help autonomous robots cope with variability, hence achieving skill generalisation. The overarching approach is to develop modular architectures that judiciously combine different forms of inductive bias for learning. In particular, we consider how models and policies should be structured in order to achieve robust behaviour in the face of different factors of variation - in the environment, in objects and in other internal parameters of a policy - with the end goal of more robust, accurate and data-efficient skill acquisition and adaptation. At a high level, variability in skill is determined by variations in constraints presented by the external environment, and in task-specific perturbations that affect the specification of optimal action. A typical example of environmental perturbation would be variation in lighting and illumination, affecting the noise characteristics of perception. An example of task perturbations would be variation in object geometry, mass or friction, and in the specification of costs associated with speed or smoothness of execution. We counteract these factors of variation by exploring three forms of structuring: utilising separate data sets curated according to the relevant factor of variation, building neural network models that incorporate this factorisation into the very structure of the networks, and learning structured loss functions. The thesis comprises four projects exploring this theme within robotics planning and prediction tasks.

Firstly, in the setting of trajectory prediction in crowded scenes, we explore a modular architecture for learning static and dynamic environmental structure. We show that factorising the prediction problem from the individual representations allows for robust and label-efficient forward modelling, and relaxes the need for full model re-training in new environments. This modularity explicitly allows for a more flexible and interpretable adaptation of trajectory prediction models using pre-trained state-of-the-art models. We show that this results in more efficient motion prediction and allows for performance comparable to state-of-the-art supervised 2D trajectory prediction.

Next, in the domain of contact-rich robotic manipulation, we consider a modular architecture that combines model-free learning from demonstration, in particular dynamic movement primitives (DMPs), with modern model-free reinforcement learning (RL), using both on-policy and off-policy approaches. We show that factorising the skill learning problem into skill acquisition and error correction, through policy adaptation strategies such as residual learning, can help improve the overall performance of policies in the context of contact-rich manipulation. Our empirical evaluation demonstrates how best to do this with DMPs, and we propose "residual Learning from Demonstration" (rLfD), a framework that combines DMPs with RL to learn a residual correction policy. Our evaluations, performed both in simulation and on a physical system, suggest that applying residual learning directly in task space and operating on the full pose of the robot can significantly improve the overall performance of DMPs. We show that rLfD offers a solution that is gentle to the joints and improves the task success and generalisation of DMPs. Last but not least, our study shows that the extracted correction policies can be transferred to different geometries and frictions through few-shot task adaptation.

Third, we employ meta-learning to learn time-invariant reward functions, wherein both the objectives of a task (i.e., the reward functions) and the policy for performing that task optimally are learnt simultaneously. We propose a novel inverse reinforcement learning (IRL) formulation that allows us to 1) vary the length of execution by learning time-invariant costs, and 2) relax the temporal alignment requirements for learning from demonstration. We apply our method to two different types of cost formulations and evaluate their performance in the context of learning reward functions for simulated placement and peg-in-hole tasks executed on a 7-DoF KUKA IIWA arm. Our results show that our approach enables learning temporally invariant rewards from misaligned demonstrations that can also generalise spatially to out-of-distribution tasks.

Finally, we employ our observations to evaluate adversarial robustness in the context of transfer learning from a source network trained on CIFAR-100 to a target network trained on CIFAR-10. Specifically, we study the effects of using robust optimisation in the source and target networks. This allows us to identify transfer learning strategies under which adversarial defences are successfully retained, in addition to revealing potential vulnerabilities. We study the extent to which adversarially robust features can preserve their defence properties against black- and white-box attacks under three different transfer learning strategies. Our empirical evaluations give insights into how well adversarial robustness under transfer learning can generalise.
- Published
- 2023
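The rLfD component described in the abstract above, a DMP-style base motion combined with a learned residual correction applied in task space, can be sketched in one dimension. The simplified transformation system, the constant residual policy and all parameter values are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

def dmp_rollout(y0, goal, forcing, dt=0.01, alpha=25.0, beta=6.25):
    """Roll out a simplified 1-D DMP transformation system."""
    y, dy = y0, 0.0
    traj = []
    for f in forcing:
        ddy = alpha * (beta * (goal - y) - dy) + f
        dy += ddy * dt
        y += dy * dt
        traj.append(y)
    return np.array(traj)

def residual_policy(state):
    """Stand-in for a learned RL correction on top of the DMP output,
    e.g. compensating a small, consistent contact offset."""
    return 0.02

# Base behaviour reproduced from demonstration (zero forcing term here).
steps = 200
base = dmp_rollout(y0=0.0, goal=1.0, forcing=np.zeros(steps))

# rLfD-style execution: DMP command plus residual correction at each step.
corrected = base + np.array([residual_policy(y) for y in base])

print("final base position:     ", base[-1])
print("final corrected position:", corrected[-1])
```

The residual term is deliberately small relative to the DMP command, mirroring the idea that the demonstration supplies the bulk of the skill and reinforcement learning only corrects it.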
7. 3D segmentation and localization using visual cues in uncontrolled environments
- Author
- Cuevas Velasquez, Hanz, Fisher, Robert, Ramamoorthy, Ram, and Murray, Iain
- Subjects
- 3D scene understanding, disparity, 3D segmentation, semantic segmentation, point cloud segmentation, scene understanding
- Abstract
3D scene understanding is an important area in robotics, autonomous vehicles, and virtual reality. The goal of scene understanding is to recognize and localize all the objects around the agent. This is done through semantic segmentation and depth estimation. Current approaches focus on improving robustness in each task but fail to make them efficient enough for real-time usage. This thesis presents four efficient methods for scene understanding that work in real environments. The methods also aim to provide a solution for both 2D and 3D data.

The first approach presents a pipeline that combines the block matching algorithm for disparity estimation, an encoder-decoder neural network for semantic segmentation, and a refinement step that uses both outputs to complete the regions that were not labelled or did not have any disparity assigned to them. This method provides accurate results in 3D reconstruction and morphology estimation of complex structures like rose bushes. Due to the lack of datasets of rose bushes and their segmentation, we also created three large datasets. Two of them contain real roses that were manually labelled, and the third was created using a scene modeler and 3D rendering software. The last dataset aims to capture diversity and realism and to provide different types of labelling.

The second contribution provides a strategy for real-time rose pruning using visual servoing of a robotic arm together with our previous approach. Current methods obtain the structure of the plant and plan the cutting trajectory using only a global planner, and assume a constant background. Our method works in real environments and uses visual feedback to refine the location of the cutting targets and modify the planned trajectory. The proposed visual servoing allows the robot to reach the cutting points 94% of the time, an improvement over using only a global planner without visual feedback, which reaches the targets 50% of the time. To the best of our knowledge, this is the first robot able to prune a complete rose bush in a natural environment.

Recent deep learning networks for image segmentation and disparity estimation provide accurate results. However, most of these methods are computationally expensive, which makes them impractical for real-time tasks. Our third contribution uses multi-task learning to learn image segmentation and disparity estimation together, end-to-end. The experiments show that our network has at most 1/3 of the parameters of the state of the art for each individual task and still provides competitive results.

The last contribution explores scene understanding using 3D data. Recent approaches use point-based networks for point cloud segmentation and find local relations between points using only the latent features provided by the network, omitting the geometric information from the point clouds. Our approach aggregates the geometric information into the network. Given that the geometric and latent features are different, our network also uses a two-headed attention mechanism to perform local aggregation at the latent and geometric levels. This additional information helps the network to obtain a more accurate semantic segmentation of real point cloud data, using fewer parameters than current methods. Overall, the method obtains state-of-the-art segmentation on the real-world S3DIS dataset with 69.2% and competitive results on the ModelNet40 and ShapeNetPart datasets.
- Published
- 2022
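The last contribution's two-headed local aggregation, attention computed separately over neighbours' latent features and over their geometric offsets, can be sketched roughly as below. The feature dimensions, score functions and the equal-weight fusion of the two heads are illustrative assumptions, not the thesis's architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def two_headed_aggregation(xyz, feats, k=8):
    """Aggregate each point's neighbourhood with two attention heads:
    one scored on latent-feature similarity, one on geometric offsets."""
    n, d = feats.shape
    # k nearest neighbours by Euclidean distance (brute force for clarity).
    dists = np.linalg.norm(xyz[:, None, :] - xyz[None, :, :], axis=-1)
    knn = np.argsort(dists, axis=1)[:, :k]                  # (n, k)

    neigh_feats = feats[knn]                                # (n, k, d)
    offsets = xyz[knn] - xyz[:, None, :]                    # (n, k, 3)

    # Latent head: dot-product similarity with each neighbour.
    latent_scores = np.einsum("nd,nkd->nk", feats, neigh_feats) / np.sqrt(d)
    # Geometric head: closer neighbours score higher.
    geo_scores = -np.linalg.norm(offsets, axis=-1)

    attn = 0.5 * softmax(latent_scores) + 0.5 * softmax(geo_scores)
    return np.einsum("nk,nkd->nd", attn, neigh_feats)       # (n, d)

# Toy usage: 100 random points with 16-dimensional latent features.
rng = np.random.default_rng(0)
xyz = rng.normal(size=(100, 3))
feats = rng.normal(size=(100, 16))
print(two_headed_aggregation(xyz, feats).shape)  # (100, 16)
```

The intent is only to show the shape of the computation: geometric information enters the attention scores directly instead of being discarded after feature extraction.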
8. Interactive task learning from corrective feedback
- Author
- Appelgren, Mattias, Lascarides, Alex, and Ramamoorthy, Ram
- Abstract
In complex teaching scenarios it can be difficult for teachers to exhaustively express all the information a learner requires to master a task. However, the teacher, who will have internalised the task's objectives, is able to identify good and bad actions in specific scenarios and can formulate advice upon observing those scenarios. This thesis focuses on the design, implementation and evaluation of models that enable experts to teach agents through such situated feedback in an Interactive Task Learning (ITL) setting.

There is a class of highly natural speech acts which has so far gone largely unexplored in the domain of ITL: corrections, in which the teacher articulates the mistake the learning agent has just made. The aim of this thesis is to show that the evidence such speech acts provide can be exploited in ITL to learn a task in a data-efficient manner. Further, we aim to show that this is made possible by capturing within the learning agent's models the constraints that are imposed by dialogue coherence. A dialogue is coherent if the current utterance relates to a salient part of its dialogue context with a specific coherence relation, such as explanation, contrast, correction, or elaboration. Our model exploits the semantics of these relations to restrict the set of possible interpretations of the teacher's utterance and of how the utterance relates to the objects involved in the action the teacher is giving feedback on.

We test our hypothesis on a tower building task where the set of allowed towers is constrained by rules. The agent starts out ignorant of these rules and, perhaps more fundamentally, is also unaware of the domain-level concepts used to define the rules and of the natural language terms that denote those concepts. We develop an agent which utilises the coherence of the extended dialogue to interpret and disambiguate the teacher's feedback, and utilises this (estimated) interpretation to refine its model of the domain, the mapping from NL descriptions to their denotations given their observable visual features, and the planning problem being addressed. We extend this model to deal with utterances containing anaphora and to deal with an imperfect teacher: that is, one who occasionally doesn't provide the correct correction in a timely way, and/or who is confident, but wrong, about the learner's ability to identify from her utterance the salient part of the context that it is intended to correct. Finally, we use these ideas to learn the manner in which actions should be performed.
- Published
- 2022
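A minimal sketch of how dialogue coherence can prune interpretations of a corrective utterance, as described in the abstract above: a candidate reading is kept only if, under that reading, the action being corrected really was a mistake. The blocks-world state, the rule representation, and the example utterance are illustrative assumptions, not the thesis's model.

```python
# Blocks-world toy: the agent has just placed a block and the teacher says
# "no, put red blocks on blue blocks". Candidate interpretations are rules
# of the form (top_colour, bottom_colour). Coherence with a *correction*
# requires that the rule is violated by the action being corrected.

last_action = {"top": "red", "bottom": "green"}   # the corrected action

candidate_rules = [
    ("red", "blue"),     # red blocks must go on blue blocks
    ("blue", "red"),     # blue blocks must go on red blocks
    ("green", "blue"),   # green blocks must go on blue blocks
]

def violated_by(rule, action):
    """A rule (top, bottom) is violated if the action places a block of
    colour `top` on something that is not of colour `bottom`."""
    top, bottom = rule
    return action["top"] == top and action["bottom"] != bottom

# Keep only readings coherent with the correction relation, i.e. readings
# under which the corrected action was actually a mistake.
coherent = [r for r in candidate_rules if violated_by(r, last_action)]
print(coherent)  # [('red', 'blue')] is the only reading that explains "no"
```

This mirrors, in a much reduced form, the thesis's use of coherence relations to restrict how the teacher's utterance can relate to the objects in the corrected action.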