Abstract
Many practical tasks in robotic systems, such as cleaning windows, writing
or grasping, are inherently constrained. Learning policies subject to
constraints is a challenging problem. In this paper, we propose a
\emph{constraint-aware learning} method that solves the policy learning
problem for redundant robots that execute a policy acting in the
null-space of a constraint. In particular, we are interested in generalizing
learnt null-space policies across constraints that were not known during
training. We split the combined problem of learning constraints and policies
into two steps: first estimating the constraint, and then estimating a null-space policy
using the remaining degrees of freedom. For a linear parametrization, we
provide a closed-form solution of the problem. We also define a metric for
comparing estimated constraints, which is useful for preprocessing
the trajectories recorded in the demonstrations. We have validated
our method by learning a wiping task from human demonstration on flat
surfaces and reproducing it on an unknown curved surface using a
force/torque-based controller to achieve tool alignment. We show that,
despite the differences between the training and validation scenarios,
we learn a policy that still provides the desired wiping motion.
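
As a rough illustration of the second step (fitting a null-space policy once the constraint is known), the sketch below solves a linear least-squares problem in closed form. It is not the paper's exact formulation: it assumes that each constraint matrix `A` has already been estimated, that observed actions satisfy `u = N @ pi(x)` with `N = I - pinv(A) @ A`, and that the policy is affine in the state, `pi(x) = W @ [x, 1]`. The function names `null_space_projector` and `fit_null_space_policy` are hypothetical.

```python
import numpy as np


def null_space_projector(A):
    """Return N = I - pinv(A) @ A, the projector onto the null space of A."""
    return np.eye(A.shape[1]) - np.linalg.pinv(A) @ A


def fit_null_space_policy(A_list, x_list, u_list):
    """Least-squares fit of W in pi(x) = W @ [x, 1] from projected actions.

    Assumes each observation satisfies u = N @ pi(x), where N is built
    from the (already estimated) constraint matrix A of that sample.
    """
    rows, targets = [], []
    for A, x, u in zip(A_list, x_list, u_list):
        N = null_space_projector(A)
        phi = np.append(x, 1.0)  # affine features (an assumption of this sketch)
        # N @ W @ phi == kron(phi, N) @ vec(W), with column-major vec(W)
        rows.append(np.kron(phi, N))
        targets.append(u)
    M = np.vstack(rows)
    t = np.concatenate(targets)
    w, *_ = np.linalg.lstsq(M, t, rcond=None)
    d = A_list[0].shape[1]
    return w.reshape((d, -1), order="F")
```

Because the fitted `W` describes the unconstrained policy, it can be projected through the null space of a constraint that was not seen during training, for example `null_space_projector(A_new) @ W @ np.append(x, 1.0)`. Recovering `W` fully requires demonstrations under sufficiently varied constraints; otherwise `lstsq` returns the minimum-norm solution.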
| Original language | English |
|---|---|
| Pages (from-to) | 1673-1689 |
| Number of pages | 17 |
| Journal | International Journal of Robotics Research |
| Volume | 37 |
| Issue number | 13-14 |
| Early online date | 26 Jul 2018 |
| DOIs | |
| Publication status | Published - 1 Dec 2018 |
Fingerprint
Dive into the research topics of 'Constraint-aware Learning of Policies by Demonstration'. Together they form a unique fingerprint.
Profiles
- Mustafa Suphi Erden
- School of Engineering & Physical Sciences - Professor
- School of Engineering & Physical Sciences, Institute of Sensors, Signals & Systems - Professor
Person: Academic (Research & Teaching)