First-Order Function Approximation for Transfer Learning in Relational MDPs

Jun Hao Alvin Ng, Ronald P. A. Petrick

Research output: Contribution to conference (Paper, peer-reviewed)

Abstract

Planning problems with a first-order structure can be modelled compactly with Relational Markov Decision Processes (RMDPs). If the model is unknown, value-based reinforcement learning methods can be used to solve these problems. The action-value function is approximated with features that are conjunctions of ground state fluents. However, this approximation does not exploit the first-order structure of RMDPs, and the generated policy can solve only a single ground MDP of the RMDP. Our objective is to learn a generalised function approximation which induces a policy that can solve multiple ground MDPs. We achieve this by using conjunctions of lifted state fluents as first-order features. This first-order approximation gives better generalisation but has a coarser granularity, which can worsen performance. We propose combining first-order features with ground features to get the strengths of both. Empirical results for four domains show that our method generalises over problems regardless of their scale and allows transfer learning.
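
The difference between ground and first-order features described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the tuple encoding of fluents, the `?x`-style variables, the existential reading of a lifted conjunction, the blocks-world predicates, the weights, and the helper names `ground_feature`, `lifted_feature`, and `q_value` are all assumptions made for illustration. For simplicity the sketch scores a state for one fixed action with a linear combination of features.

```python
from itertools import product

def ground_feature(conjunction):
    """Binary feature: 1.0 when every ground fluent in the conjunction holds."""
    return lambda state: 1.0 if all(f in state for f in conjunction) else 0.0

def lifted_feature(conjunction, variables):
    """Binary feature: 1.0 when some binding of objects to the variables
    makes every fluent in the conjunction hold (an assumed existential reading)."""
    def phi(state):
        # Objects are read off the state, so the feature applies to any problem size.
        objects = sorted({arg for fluent in state for arg in fluent[1:]})
        for binding in product(objects, repeat=len(variables)):
            theta = dict(zip(variables, binding))
            grounded = [(f[0],) + tuple(theta.get(a, a) for a in f[1:])
                        for f in conjunction]
            if all(g in state for g in grounded):
                return 1.0
        return 0.0
    return phi

def q_value(weights, features, state):
    """Linear approximation: weighted sum of feature values."""
    return sum(w * f(state) for w, f in zip(weights, features))

# Two hypothetical blocks-world states over different sets of objects.
state_small = {("on", "a", "b"), ("clear", "a"), ("ontable", "b")}
state_large = {("on", "c", "d"), ("on", "d", "e"), ("clear", "c"), ("ontable", "e")}

g = ground_feature([("on", "a", "b"), ("clear", "a")])
l = lifted_feature([("on", "?x", "?y"), ("clear", "?x")], ["?x", "?y"])

features = [g, l]
weights = [0.5, 2.0]  # illustrative values, not learned

print(g(state_small), g(state_large))  # ground feature fires only on its own objects
print(l(state_small), l(state_large))  # lifted feature fires in both states
print(q_value(weights, features, state_small), q_value(weights, features, state_large))
```

In this sketch the ground feature is inactive in the second state because it names specific objects, while the lifted feature is active in both. Putting both kinds of feature in one weight vector mirrors the combination the abstract proposes: the lifted weight carries over to new problems, and the ground weight adds problem-specific detail.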
Original language: English
Publication status: Published - 5 Aug 2021
Event: 31st International Conference on Automated Planning and Scheduling: Workshop on Bridging the Gap Between AI Planning and Reinforcement Learning - Online, Guangzhou, China
Duration: 2 Jun 2021 to 13 Jun 2021

Conference

Conference: 31st International Conference on Automated Planning and Scheduling
Abbreviated title: ICAPS 2021
Country/Territory: China
City: Guangzhou
Period: 2/06/21 to 13/06/21
