
A Conclusive Analysis of the Finite-Time Behavior of the Discretized Pursuit Learning Automaton

Authors :
B. John Oommen
Xuan Zhang
Lei Jiao
Ole-Christoffer Granmo
Source :
IEEE Transactions on Neural Networks and Learning Systems
Publication Year :
2020
Publisher :
Institute of Electrical and Electronics Engineers (IEEE), 2020.

Abstract

This paper deals with the finite-time analysis (FTA) of learning automata (LA), a topic on which very little work has been reported in the literature. This is as opposed to the asymptotic steady-state analysis, for which there are, probably, scores of papers. As clarified later, the FTA of Markov chains in general, and of LA in particular, is unarguably far more complex than the asymptotic steady-state analysis. Such an FTA provides rigid bounds for the time required for the LA to attain a given convergence accuracy. We concentrate on the FTA of the Discretized Pursuit Automaton (DPA), which is probably one of the fastest and most accurate LA reported. Although such an analysis was carried out many years ago, we record that the previous work is flawed. More specifically, in all brevity, the flaw lies in the wrongly "derived" monotonic behavior of the LA after a certain number of iterations. Rather, we claim that the property that should be invoked is the submartingale property. This renders the proof much more involved and deep. In this paper, we rectify the flaw and reestablish the FTA based on such a submartingale phenomenon. More importantly, from the derived analysis, we are able to discover and clarify, for the first time, the underlying dilemma between the DPA's exploitation and exploration properties. We also nontrivially confirm the existence of the optimal learning rate, which yields a better comprehension of the DPA itself.
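To make the object of the analysis concrete, the following is a minimal Python sketch of the standard Discretized Pursuit Automaton update (the reward-inaction variant described in the LA literature), not the paper's finite-time analysis itself. The function and parameter names (run_dpa, reward_probs, resolution) are chosen here for illustration, and the customary initial pre-sampling of every action is omitted for brevity.

# Illustrative sketch of the standard Discretized Pursuit Automaton (reward-inaction
# variant), under the assumptions stated above -- not the paper's derivation.
import random

def run_dpa(reward_probs, resolution=100, steps=10_000, seed=0):
    rng = random.Random(seed)
    r = len(reward_probs)              # number of actions
    delta = 1.0 / (r * resolution)     # smallest probability step, Delta = 1/(r*N)
    p = [1.0 / r] * r                  # action-probability vector
    rewards = [0] * r                  # times each action was rewarded
    chosen = [0] * r                   # times each action was chosen

    for _ in range(steps):
        # Sample an action according to the current probability vector.
        a = rng.choices(range(r), weights=p)[0]
        rewarded = rng.random() < reward_probs[a]
        chosen[a] += 1
        rewards[a] += int(rewarded)

        # Maximum-likelihood reward estimates; actions never chosen default to 0.
        d_hat = [rewards[i] / chosen[i] if chosen[i] else 0.0 for i in range(r)]
        m = max(range(r), key=lambda i: d_hat[i])   # currently best-estimated action

        # Pursuit update, applied only on reward: move probability mass toward
        # action m in discrete steps of size delta.
        if rewarded:
            for j in range(r):
                if j != m:
                    p[j] = max(p[j] - delta, 0.0)
            p[m] = 1.0 - sum(p[j] for j in range(r) if j != m)
    return p

if __name__ == "__main__":
    # With reward probabilities (0.7, 0.5, 0.3), the mass should concentrate on action 0.
    print(run_dpa([0.7, 0.5, 0.3]))

The resolution parameter N (and hence the step size Delta = 1/(r*N)) plays the role of the learning rate discussed in the abstract: larger N slows convergence but reduces the chance of locking onto a suboptimal action, which is the exploitation-exploration trade-off the paper analyzes.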

Details

ISSN :
2162-2388 and 2162-237X
Volume :
31
Database :
OpenAIRE
Journal :
IEEE Transactions on Neural Networks and Learning Systems
Accession number :
edsair.doi.dedup.....10bea0dd36b295d0ffd9292ab6d6b010
Full Text :
https://doi.org/10.1109/tnnls.2019.2900639