Twin Delayed Hierarchical Actor-Critic

Authors :
Matthew Studley
Mihai Anca
Source :
ICARA
Publication Year :
2021
Publisher :
IEEE, 2021.

Abstract

Hierarchical Reinforcement Learning (HRL) addresses a common problem in sparse-reward environments: the need to manually craft a reward function. We present a modified version of the Hierarchical Actor-Critic (HAC) architecture called Twin Delayed HAC (TDHAC), a method capable of sample-efficient learning in environments requiring object interaction. The vanilla HAC algorithm fails to converge in this type of environment, while our method matches the best results reported so far in the literature. We carefully consider each feature added to the original architecture and demonstrate the abilities of TDHAC on the sparse-reward Pick-and-Place environment. To the best of our knowledge, this is the first HRL algorithm successfully applied to an environment requiring object interaction without external enhancements such as demonstrations.

Details

Database :
OpenAIRE
Journal :
2021 7th International Conference on Automation, Robotics and Applications (ICARA)
Accession number :
edsair.doi...........288342e9be485986323505c4ede352ca