Reinforcement learning in a behaviour-based control architecture for marine archaeology

Gordon William Frost, Francesco Maurelli, David M. Lane

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

11 Citations (Scopus)

Abstract

We present a novel path planner for adaptive behaviour of an Autonomous Underwater Vehicle (AUV). A behaviour-based architecture forms the foundation of the system, with an additional layer that uses experience to learn a policy for modulating the behaviours' weights. In effect, this creates an abstract environment for the Reinforcement Learning (RL) agent's state and action space. This abstraction simplifies the problem the RL agent must solve and yields a more stable system. The Episodic Natural Actor Critic (ENAC) RL algorithm is used because the input and output domains are continuous and because of the natural actor critic's convergence properties. The adaptiveness of the system is demonstrated in a thruster failure scenario, in which RL learns an appropriate policy for the behaviours' weights under the new vehicle dynamics. We apply this control architecture to the domain of marine archaeology, which poses an inherent problem of navigation in unknown, potentially complex and dangerous environments. Simulated results of the proposed control architecture demonstrate its feasibility and performance.
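As an illustration of what "modulating the behaviours' weights" can mean in a behaviour-based controller, the following is a minimal sketch and not the authors' implementation. The two behaviours (`go_to_goal`, `avoid_obstacle`), the 2-D state dictionary, and the weighted-average fusion rule in `blend` are all assumptions made for this example. The weights are set by hand here, whereas in the architecture described above a learned RL policy (ENAC in the paper) would supply them.

```python
import math
from typing import Callable, Dict, Sequence, Tuple

Vector = Tuple[float, float]
State = Dict[str, Vector]
Behaviour = Callable[[State], Vector]


def go_to_goal(state: State) -> Vector:
    """Hypothetical behaviour: unit vector from the vehicle towards the goal."""
    px, py = state["position"]
    gx, gy = state["goal"]
    dx, dy = gx - px, gy - py
    norm = math.hypot(dx, dy) or 1.0  # avoid division by zero at the goal
    return (dx / norm, dy / norm)


def avoid_obstacle(state: State) -> Vector:
    """Hypothetical behaviour: unit vector pointing away from the obstacle."""
    px, py = state["position"]
    ox, oy = state["obstacle"]
    dx, dy = px - ox, py - oy
    norm = math.hypot(dx, dy) or 1.0
    return (dx / norm, dy / norm)


def blend(behaviours: Sequence[Behaviour],
          weights: Sequence[float],
          state: State) -> Vector:
    """Fuse behaviour outputs as a weighted average (assumed fusion rule).

    Negative weights are clipped to zero. If every weight is zero the
    command is the zero vector.
    """
    if len(behaviours) != len(weights):
        raise ValueError("one weight is required per behaviour")
    clipped = [max(0.0, w) for w in weights]
    total = sum(clipped)
    if total == 0.0:
        return (0.0, 0.0)
    vx = vy = 0.0
    for behaviour, w in zip(behaviours, clipped):
        bx, by = behaviour(state)
        vx += w * bx
        vy += w * by
    return (vx / total, vy / total)


if __name__ == "__main__":
    state = {"position": (0.0, 0.0), "goal": (10.0, 0.0), "obstacle": (0.0, 5.0)}
    # Hand-picked weights stand in for the output of a learned policy.
    print(blend([go_to_goal, avoid_obstacle], [1.0, 1.0], state))  # (0.5, -0.5)
```

In this sketch the learning agent would only need to choose the entries of `weights`, not raw thruster commands, which is the sense in which the behaviour layer gives the agent a smaller, more abstract action space.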

Original language: English
Title of host publication: OCEANS 2015 - Genova
Subtitle of host publication: Discovering Sustainable Ocean Energy for a New World
Publisher: IEEE
ISBN (Print): 9781479987368
Publication status: Published - 2015
Event: MTS/IEEE OCEANS 2015 - Genova, Italy
Duration: 18 May 2015 to 21 May 2015

Conference

Conference: MTS/IEEE OCEANS 2015
Abbreviated title: OCEANS 2015
Country/Territory: Italy
City: Genova
Period: 18/05/15 to 21/05/15

ASJC Scopus subject areas

  • Renewable Energy, Sustainability and the Environment
  • Oceanography
