Abstract
Many practical tasks in robotic systems, such as cleaning windows, writing
or grasping, are inherently constrained. Learning policies subject to
constraints is a challenging problem. In this paper, we propose a
\emph{constraint-aware learning} method for learning policies on redundant
robots that execute a policy acting in the null-space of a constraint.
In particular, we are interested in generalizing learnt null-space policies
across constraints that were not known during training. We split the combined
problem of learning constraints and policies into two steps: first estimating
the constraint, and then estimating a null-space policy
using the remaining degrees of freedom. For a linear parametrization, we
provide a closed-form solution to the problem. We also define a metric for
comparing the similarity of estimated constraints, which is useful for
preprocessing the trajectories recorded in the demonstrations. We have validated
our method by learning a wiping task from human demonstration on flat
surfaces and reproducing it on an unknown curved surface using a
force/torque-based controller to achieve tool alignment. We show that,
despite the differences between the training and validation scenarios,
we learn a policy that still provides the desired wiping motion.
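
The two-step split described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: it assumes a 2-D task, a single-row constraint a^T u = 0 per demonstration, and a constant policy pi(x) = w (a linear parametrization with one constant feature). All function and variable names are hypothetical. Under those assumptions, step 1 estimates the constraint normal from the observed motions, and step 2 recovers w in closed form from the normal equations (sum N) w = sum N u, where N = I - a a^T / (a^T a) is the null-space projector.

```python
import math

def projector(a):
    """Null-space projector N = I - a a^T / (a^T a) for the constraint a^T u = 0."""
    ax, ay = a
    s = ax * ax + ay * ay
    return [[1 - ax * ax / s, -ax * ay / s],
            [-ax * ay / s, 1 - ay * ay / s]]

def apply(M, v):
    """2x2 matrix times 2-vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def estimate_constraint(observations):
    """Step 1: unit normal a minimising sum (a . u)^2 over one demonstration."""
    sxx = sum(u[0] * u[0] for u in observations)
    sxy = sum(u[0] * u[1] for u in observations)
    syy = sum(u[1] * u[1] for u in observations)
    # Angle of the principal axis of the 2x2 scatter matrix;
    # the constraint normal is perpendicular to that axis.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return [-math.sin(theta), math.cos(theta)]

def fit_policy(demos):
    """Step 2: closed-form w minimising sum ||u - N w||^2 over all demonstrations."""
    S = [[0.0, 0.0], [0.0, 0.0]]   # accumulates sum of N (N is symmetric, idempotent)
    b = [0.0, 0.0]                 # accumulates sum of N u
    for observations in demos:
        N = projector(estimate_constraint(observations))
        for u in observations:
            Nu = apply(N, u)
            for i in range(2):
                b[i] += Nu[i]
                for j in range(2):
                    S[i][j] += N[i][j]
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    return [(S[1][1] * b[0] - S[0][1] * b[1]) / det,
            (S[0][0] * b[1] - S[1][0] * b[0]) / det]

# Synthetic demonstrations: the same hidden policy observed under two constraints.
w_true = [1.0, 0.5]
train_normals = [[0.0, 1.0], [1.0, 1.0]]
demos = [[apply(projector(a), w_true) for _ in range(3)] for a in train_normals]

w_hat = fit_policy(demos)

# Reproduce the motion under a constraint that was not seen during training.
a_new = [1.0, 2.0]
u_new = apply(projector(a_new), w_hat)
```

Each demonstration only reveals the component of the policy that survives its own projector, so at least two demonstrations with different constraint directions are needed here for the 2x2 system to be invertible.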
Original language | English |
---|---|
Pages (from-to) | 1673-1689 |
Number of pages | 17 |
Journal | International Journal of Robotics Research |
Volume | 37 |
Issue number | 13-14 |
Early online date | 26 Jul 2018 |
Publication status | Published - 1 Dec 2018 |
Fingerprint
Dive into the research topics of 'Constraint-aware Learning of Policies by Demonstration'. Together they form a unique fingerprint.
Profiles
- Mustafa Suphi Erden
  - School of Engineering & Physical Sciences - Associate Professor
  - School of Engineering & Physical Sciences, Institute of Sensors, Signals & Systems - Associate Professor
  - Person: Academic (Research & Teaching)