Publication Year Range: Last 50 years / Publisher: ios press / Topic: 4 selected - Searchworks@Jio Institute Digital Library Search Results

Searchworks

Author: Leng, Jinsong; Fyfe, Colin; and Jain, Lakhmi C.
Subjects: ALGORITHMS, MATHEMATICAL optimization, LINEAR programming, SYSTEMS engineering, MATHEMATICAL programming, MACHINE learning
Abstract: Temporal difference learning and eligibility traces are two mechanisms for solving reinforcement learning problems. The temporal difference technique bootstraps the state value or state-action value at every step, as in dynamic programming, and learns by sampling episodes from experience, as in the Monte Carlo approach. Eligibility traces record the degree to which each state is eligible for learning. This paper investigates the underlying mechanism of eligibility trace strategies using on-policy and off-policy learning algorithms. Performance metrics are obtained by defining the learning problem in a simulation environment and running it with different learning algorithms. However, measuring learning performance and analysing sensitivity are very expensive, because such metrics can only be obtained by running experiments with different parameter values. This paper presents a comparative study of the mechanism of eligibility traces, with the objective of comparing and investigating how the different approaches influence performance. [ABSTRACT FROM AUTHOR]
Published: 2009
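
Illustrative note: the mechanism the abstract describes can be sketched in a few lines. The example below is not taken from the paper; it is a minimal tabular TD(lambda) value-prediction sketch on a hypothetical 5-state random walk (reward 1 for exiting on the right, 0 otherwise). The function name, environment, and parameter values are all assumptions chosen for illustration. It shows the bootstrapped temporal difference error and two common trace strategies, accumulating and replacing.

```python
import random


def td_lambda_random_walk(episodes=3000, alpha=0.02, gamma=1.0, lam=0.8,
                          trace="accumulating", seed=0):
    """Estimate state values of a 5-state random walk with tabular TD(lambda).

    Hypothetical illustration only: the environment and defaults are
    assumptions, not the experimental setup of the catalogued paper.
    """
    rng = random.Random(seed)
    n_states = 5                       # non-terminal states 0..4
    values = [0.0] * n_states
    for _ in range(episodes):
        traces = [0.0] * n_states      # eligibility of each state, reset per episode
        state = n_states // 2          # every episode starts in the middle
        while True:
            next_state = state + rng.choice((-1, 1))
            done = next_state < 0 or next_state >= n_states
            reward = 1.0 if next_state >= n_states else 0.0
            # Bootstrapped target: reward plus discounted estimate of the successor.
            next_value = 0.0 if done else values[next_state]
            td_error = reward + gamma * next_value - values[state]
            # Mark the visited state as eligible for learning.
            if trace == "replacing":
                traces[state] = 1.0
            else:
                traces[state] += 1.0
            # Each state learns in proportion to its eligibility, then traces decay.
            for s in range(n_states):
                values[s] += alpha * td_error * traces[s]
                traces[s] *= gamma * lam
            if done:
                break
            state = next_state
    return values


if __name__ == "__main__":
    for strategy in ("accumulating", "replacing"):
        print(strategy, [round(v, 2) for v in td_lambda_random_walk(trace=strategy)])
```

Rerunning the function over a grid of `lam`, `alpha`, and `trace` values is a small-scale analogue of the parameter sweep the abstract calls expensive: each setting requires its own full set of simulated episodes.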
