Twin Delayed Hierarchical Actor-Critic

Authors :: Matthew Studley
Mihai Anca
Source :: ICARA
Publication Year :: 2021
Publisher :: IEEE, 2021.
Abstract: Hierarchical Reinforcement Learning (HRL) addresses the common problem in sparse-reward environments of having to manually craft a reward function. We present a modified version of the Hierarchical Actor-Critic (HAC) architecture called Twin Delayed HAC (TDHAC), a method capable of sample-efficient learning in environments requiring object interaction. The vanilla algorithm fails to converge in this type of environment, while our method matches the best results reported in the literature so far. We carefully consider each feature added to the original architecture and demonstrate the abilities of TDHAC on the sparse-reward Pick-and-Place environment. To the best of our knowledge, this is the first HRL algorithm successfully applied to an environment requiring object interaction without external enhancements such as demonstrations.

Subjects :: Computer science
Feature (machine learning)
Reinforcement learning
Robotics
Artificial intelligence
Architecture
Function (engineering)
Object (computer science)

Database :: OpenAIRE
Journal :: 2021 7th International Conference on Automation, Robotics and Applications (ICARA)
Accession number :: edsair.doi...........288342e9be485986323505c4ede352ca
