Abstract
Reinforcement learning (RL) based on artificial neural networks (ANNs) has seen great successes in recent years. A notable recent breakthrough came from work by Volodymyr Mnih, in which an artificial agent learnt to play games from raw screen pixels, with scores produced by an Atari game console simulation. The algorithm surpassed human performance on many classic 1980s games. However, RL has not yet achieved similar successes on real-world tasks, such as controlling robotic manipulation. RL requires dozens of hours of gameplay to reach human-level performance, and this training time is currently a significant obstacle outside simulated systems. It has been argued that pure RL is data-inefficient because the reward signal, the only feedback used, is sparse and contains little information.
One approach to extracting more information from an agent's experiences is to train it to predict future observations. In this work, we investigate the learning of stochastic forward models of environments from raw sensory observations. Such models could then be used for probabilistic state estimation, future prediction, planning and, ultimately, more efficient behaviour. The overarching goal is data-efficient reinforcement learning.
Original language | English |
---|---|
Number of pages | 1 |
Publication status | Published - 4 Jun 2018 |
Event | 2018 EPSRC CDT Student Conference – Oxford, Bristol and Edinburgh - Bristol, United Kingdom. Duration: 4 Jun 2018 → 5 Jun 2018 |
Conference
Conference | 2018 EPSRC CDT Student Conference – Oxford, Bristol and Edinburgh |
---|---|
Country/Territory | United Kingdom |
City | Bristol |
Period | 4/06/18 → 5/06/18 |