The True Online Continuous Learning Automation (TOCLA) in a continuous control benchmarking of actor-critic algorithms

Gordon William Frost, Marta Vallejo

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Reinforcement learning problems are often discretised, use linear function approximation, or perform batch updates. However, many applications that can benefit from reinforcement learning contain continuous variables and are inherently non-linear, for example, the control of aerospace or maritime robotic vehicles. Recent work has focused on online temporal difference methods, specifically for use with non-linear function approximation. In this paper, we evaluate the Forward Actor-Critic against the regular Actor-Critic and the Continuous Actor-Critic Learning Automation (CACLA). We also propose and evaluate a new algorithm called True Online Continuous Learning Automation (TOCLA), which combines these two approaches. The chosen benchmark problem was the Mountain Car Continuous-v0 environment from OpenAI Gym, which represents a further step in complexity over the benchmark used to test the Forward Actor-Critic in previous works. Our results demonstrate the superiority of TOCLA in terms of its sensitivity to hyper-parameter selection compared with the Forward Actor-Critic, CACLA, and Actor-Critic algorithms.
Original language: English
Title of host publication: 2020 IEEE Symposium Series on Computational Intelligence (SSCI)
Publisher: IEEE
Pages: 266-275
Number of pages: 10
ISBN (Electronic): 9781728125473
DOIs
Publication status: Published - 5 Jan 2021
Event: 2020 IEEE Symposium Series on Computational Intelligence - Canberra, Australia
Duration: 1 Dec 2020 to 4 Dec 2020
Conference number: 46
http://www.ieeessci2020.org/

Conference

Conference: 2020 IEEE Symposium Series on Computational Intelligence
Abbreviated title: SSCI 2020
Country/Territory: Australia
City: Canberra
Period: 1/12/20 to 4/12/20

Keywords

  • Actor-Critic
  • CACLA
  • Forward Actor-Critic
  • Nonlinear Function Approximation
  • Reinforcement Learning
  • TOCLA

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Decision Sciences (miscellaneous)
