Combining Visual and Social Dialogue for Human-Robot Interaction

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)
54 Downloads (Pure)


We will demonstrate a prototype multimodal conversational AI system that will act as a receptionist in a hospital waiting room, combining visually-grounded dialogue with social conversation. The system supports visual object conversation in the waiting room (e.g. looking for available seats or personal belongings), task-based dialogues regarding navigation and check-in procedures in the hospital, as well as access to the latest news, and a quiz game about coronavirus.
The prototype system therefore demonstrates how to weave together a wide range of natural, everyday conversations with end users that vary in complexity, from complex visual dialogue to chitchat and quiz games to task-oriented, domain-specific conversations.
We are currently able to demonstrate the system via a web-based interface. It will soon be deployed on the ARI robot in a hospital waiting room.
Original language: English
Title of host publication: ICMI '21: Proceedings of the 2021 International Conference on Multimodal Interaction
Subtitle of host publication: Montréal QC Canada October 18 - 22, 2021
Editors: Zakia Hammal, Carlos Busso, Catherine Pelachaud, Sharon Oviatt, Albert Ali Salah, Guoying Zhao
Place of Publication: New York
Publisher: Association for Computing Machinery
Number of pages: 2
ISBN (Print): 9781450384810
Publication status: Published - 18 Oct 2021
Event: 23rd ACM International Conference on Multimodal Interaction 2021 - Montreal, Canada
Duration: 18 Oct 2021 to 22 Oct 2021


Conference: 23rd ACM International Conference on Multimodal Interaction 2021
Abbreviated title: ICMI 2021


Keywords

  • Social Dialogue
  • Visual Dialogue

ASJC Scopus subject areas

  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Hardware and Architecture
  • Human-Computer Interaction

