TY - JOUR
T1 - Activity Recognition for Ambient Assisted Living with Videos, Inertial Units and Ambient Sensors
AU - Ranieri, Caetano Mazzoni
AU - MacLeod, Scott
AU - Dragone, Mauro
AU - Vargas, Patricia Amancio
AU - Romero, Roseli Aparecida Francelin
N1 - Funding Information:
This research was funded by the Sao Paulo Research Foundation, grants 2017/02377-5, 2018/25902-0 and 2017/01687-0, and METRICS (H2020-ICT-2019-2-#871252). It was carried out using the computational resources of the Center for Mathematical Sciences Applied to Industry (CeMEAI) funded by FAPESP, grant 2013/07375-0. Additional resources were provided by the Nvidia Grants program.
Publisher Copyright:
© 2021 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2021/2/1
Y1 - 2021/2/1
AB - Worldwide demographic projections point to a progressively older population. This fact has fostered research on Ambient Assisted Living, which includes developments on smart homes and social robots. To endow such environments with truly autonomous behaviours, algorithms must extract semantically meaningful information from whichever sensor data is available. Human activity recognition is one of the most active fields of research within this context. Proposed approaches vary according to the input modality and the environments considered. Unlike previous work, this paper addresses the problem of recognising heterogeneous activities of daily living centred in home environments by simultaneously considering data from videos, wearable IMUs and ambient sensors. For this, two contributions are presented. The first is the creation of the Heriot-Watt University/University of Sao Paulo (HWU-USP) activities dataset, which was recorded at the Robotic Assisted Living Testbed at Heriot-Watt University. This dataset differs from other multimodal datasets because it consists of daily living activities with either periodic patterns or long-term dependencies, captured in a rich and heterogeneous sensing environment. In particular, this dataset combines data from a humanoid robot's RGBD (RGB + depth) camera with inertial sensors from wearable devices and ambient sensors from a smart home. The second contribution is the proposal of a Deep Learning (DL) framework, which provides multimodal activity recognition based on videos, inertial sensors and ambient sensors from the smart home, used individually or fused with each other. The DL classification framework was also validated on our dataset and on the University of Texas at Dallas Multimodal Human Activities Dataset (UTD-MHAD), a widely used benchmark for activity recognition based on videos and inertial sensors, providing a comparative analysis between the results on the two datasets considered. Results demonstrate that the introduction of data from ambient sensors substantially improved the recognition accuracy.
KW - Deep learning
KW - Human activity recognition
KW - Human–robot interaction
KW - Inertial sensors
KW - Multimodal datasets
KW - Video classification
UR - http://www.scopus.com/inward/record.url?scp=85099753532&partnerID=8YFLogxK
U2 - 10.3390/s21030768
DO - 10.3390/s21030768
M3 - Article
C2 - 33498829
SN - 1424-8220
VL - 21
JO - Sensors
JF - Sensors
IS - 3
M1 - 768
ER -