A posteriori control densities: Imitation learning from partial observations.
- Source :
- Pattern Recognition Letters. May 2023, Vol. 169, p87-94. 8p.
- Publication Year :
- 2023
Abstract
- Imitation Learning with full state-action observations is not manageable in many applications.
- Generalizing Imitation Learning to partial observations offers a natural setting for learning.
- Facing the problem through Bayesian Inference yields several Behavioral Cloning strategies.
- Investigation of the different strategies reveals a rich mathematical structure.
- Connections with other strategies are investigated.

This paper treats a special case of the Imitation from Observations (IfO) problem. IfO is a generalisation of Imitation Learning from state-only demonstrations. Our treatment of IfO considers the case of feature-only demonstrations. This means that the full state is inaccessible for inference, and imitation must occur on the basis of a limited set of features. We refer to this setting as Imitation from Partial Observations (IfPO). This scenario has the advantage of accommodating a wider variety of demonstrations, as well as solving the problem of a heteromorphic student and teacher. We seek policy learning methods that extract an executable state-feedback policy directly from those features, an approach known in the literature as Behavioural Cloning. In this theoretical work, we formalize the rational inference model of the student decision maker, devoted to imitation, as a controlled Hidden Markov Model. The IfPO problem is then reformulated as a Maximum Likelihood Estimation problem and treated using Expectation-Maximization. We name the resulting fixed point iterations A Posteriori Control Densities. We compare the presented approach to existing methods in the field and identify potential directions for further development, such as an extension to unknown transition and emission models. [ABSTRACT FROM AUTHOR]
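
The reformulation described in the abstract can be illustrated with a generic textbook sketch. It is not taken from the paper, and all of its notation is an assumption made here for illustration: finite state and control spaces, hidden states x_t, hidden controls u_t, observed features y_t that depend only on the state, known transition and emission models, and a time-invariant tabular policy \pi(u | x). Under those assumptions, Maximum Likelihood Estimation of the policy and the corresponding Expectation-Maximization iteration might read:

```latex
% Assumed notation (not from the source): x_t hidden state, u_t hidden control,
% y_t observed feature, p(x_{t+1} | x_t, u_t) and p(y_t | x_t) known models.

% Log-likelihood of a feature-only demonstration y_{0:T} under policy \pi
\ell(\pi) = \log \sum_{x_{0:T}} \sum_{u_{0:T-1}}
  p(x_0)\, p(y_0 \mid x_0)
  \prod_{t=0}^{T-1} \pi(u_t \mid x_t)\, p(x_{t+1} \mid x_t, u_t)\, p(y_{t+1} \mid x_{t+1})

% E-step: posterior over state-control pairs under the current iterate \pi^{(k)}
\gamma_t^{(k)}(x, u) = p\bigl(x_t = x,\; u_t = u \mid y_{0:T};\, \pi^{(k)}\bigr)

% M-step: normalised posterior counts give the next policy iterate
\pi^{(k+1)}(u \mid x) =
  \frac{\sum_{t=0}^{T-1} \gamma_t^{(k)}(x, u)}
       {\sum_{t=0}^{T-1} \sum_{u'} \gamma_t^{(k)}(x, u')}
```

In this simplified setting the update is a fixed-point iteration on the policy in which each iterate is built from a posterior over the unobserved controls. The paper's actual formulation may differ in its model class and in the form of the update.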
Details
- Language :
- English
- ISSN :
- 01678655
- Volume :
- 169
- Database :
- Academic Search Index
- Journal :
- Pattern Recognition Letters
- Publication Type :
- Academic Journal
- Accession number :
- 163308883
- Full Text :
- https://doi.org/10.1016/j.patrec.2023.04.001