To Whom are You Talking? A Deep Learning Model to Endow Social Robots with Addressee Estimation Skills

Carlo Mazzola, Marta Romeo, Francesco Rea, Alessandra Sciutti, Angelo Cangelosi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

Communicating shapes our social world. For a robot to be considered social and consequently be integrated into our social environment, it is fundamental that it understand some of the dynamics that rule human-human communication. In this work, we tackle the problem of Addressee Estimation, the ability to understand an utterance's addressee, by interpreting and exploiting non-verbal bodily cues from the speaker. We do so by implementing a hybrid deep learning model composed of convolutional layers and LSTM cells, taking as input images portraying the face of the speaker and 2D vectors of the speaker's body posture. Our implementation choices were guided by the aim of developing a model that could be deployed on social robots and be efficient in ecological scenarios. We demonstrate that our model is able to solve the Addressee Estimation problem in terms of addressee localisation in space, from a robot ego-centric point of view.
Original language: English
Title of host publication: 2023 International Joint Conference on Neural Networks (IJCNN)
Publisher: IEEE
ISBN (Electronic): 9781665488679
Publication status: Published - 2 Aug 2023
Event: 2023 International Joint Conference on Neural Networks - Gold Coast, Australia
Duration: 18 Jun 2023 to 23 Jun 2023

Conference

Conference: 2023 International Joint Conference on Neural Networks
Country/Territory: Australia
Period: 18/06/23 to 23/06/23

Keywords

  • Addressee Estimation
  • Deep learning
  • Human activity recognition
  • Human-robot interaction
  • Social Robot

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
