Abstract
Recent developments in computer vision and conversational systems have provided the AI community with novel perspectives towards improving the cognitive capabilities of engaging socially assistive robots. We show how to develop conversational skills for a hospital receptionist robot that incorporates social conversation based on visual information as well as task-based dialog. Fusing the traditional modular conversational system architecture with recent developments in computer vision and scene graph research, our agent (called 'ViCA') supports both visual question answering and social conversational capabilities based on the visual scene. In particular, our agent can provide guidance to users by locating visible objects in the room and can engage in social dialog using visual prompts, such as the user's clothing or possessions. We conduct a comprehensive online evaluation study with 21 participants, showcasing that the ViCA system is perceived as both helpful and entertaining.
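The visually grounded guidance described in the abstract can be pictured with a minimal sketch. The snippet below is illustrative only and is not ViCA's actual implementation: the `SceneObject` structure, the `scene_graph` contents, and the `locate` function are hypothetical stand-ins for the vision and dialog modules the abstract mentions, showing how a scene-graph-style representation could back a "where is X?" guidance response.

```python
# Minimal sketch, assuming a hypothetical scene-graph representation;
# not the paper's actual ViCA implementation.
from dataclasses import dataclass


@dataclass
class SceneObject:
    label: str   # detected object label, e.g. "reception desk"
    region: str  # coarse spatial description, e.g. "to your left"


# Toy scene graph as a vision module might produce it (contents invented).
scene_graph = [
    SceneObject("reception desk", "straight ahead"),
    SceneObject("water dispenser", "to your left"),
    SceneObject("waiting area", "behind you"),
]


def locate(query: str) -> str:
    """Answer a 'where is X?' question from the detected scene objects."""
    for obj in scene_graph:
        if query.lower() in obj.label:
            return f"The {obj.label} is {obj.region}."
    return "Sorry, I cannot see that from here."


if __name__ == "__main__":
    print(locate("water dispenser"))  # -> "The water dispenser is to your left."
```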
Original language | English
---|---
Title of host publication | ICMI '21: Proceedings of the 2021 International Conference on Multimodal Interaction
Publisher | Association for Computing Machinery
Pages | 71-79
Number of pages | 9
ISBN (Electronic) | 9781450384810
DOIs |
Publication status | Published - 18 Oct 2021
Event | 23rd ACM International Conference on Multimodal Interaction 2021 - Montreal, Canada. Duration: 18 Oct 2021 → 22 Oct 2021
Conference
Conference | 23rd ACM International Conference on Multimodal Interaction 2021 |
---|---
Abbreviated title | ICMI 2021 |
Country/Territory | Canada |
City | Montreal |
Period | 18/10/21 → 22/10/21 |
Keywords
- conversational agents
- human robot interaction
- multimodal interaction
ASJC Scopus subject areas
- Computer Science Applications
- Computer Vision and Pattern Recognition
- Hardware and Architecture
- Human-Computer Interaction