Am I Allergic to This? Assisting Sight Impaired People in the Kitchen

Elisa Ramil Brick, Vanesa Caballero Alonso, Conor O'Brien, Sheron Tong, Emilie Tavernier, Amit Parekh, John-Angus Addlesee, Oliver Lemon

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

Abstract

Sight-Impaired People (SIP) need assistance with food packaging, which can contain safety-critical information. Textual Visual Question Answering (VQA) could therefore prove critical in increasing the independence of SIP [26, 35], yet it has only recently received attention. For instance, comprehending text within an image is necessary to determine what type of soup is in a can, how long to cook a microwave meal, when a box of eggs will expire, and whether a meal contains an ingredient the user is allergic to. These examples all relate to a kitchen setting, a particularly challenging area for SIP. We extended the existing Aye-saac voice assistant prototype with this task and setting in mind. We developed textual VQA components to accurately understand what a user is asking, to extract relevant text from images in an intelligent manner, and to provide natural language answers that build upon the context of previous questions. As our system is intended to be assistive, we designed it with a particular focus on privacy, transparency, and controllability. These are vital objectives that existing systems do not cover. We found that our system outperformed other VQA systems on real food packaging questions asked by SIP from the VizWiz dataset [19].

Original language: English
Title of host publication: ICMI '21: Proceedings of the 2021 International Conference on Multimodal Interaction
Subtitle of host publication: Montréal QC Canada October 18 - 22, 2021
Editors: Zakia Hammal, Carlos Busso, Catherine Pelachaud, Sharon Oviatt, Albert Ali Salah, Guoying Zhao
Place of Publication: New York
Publisher: Association for Computing Machinery
Pages: 92-102
Number of pages: 11
ISBN (Print): 978-1-4503-8481-0
Publication status: Published - 18 Oct 2021
Event: 23rd ACM International Conference on Multimodal Interaction 2021 - Montreal, Canada
Duration: 18 Oct 2021 to 22 Oct 2021

Conference

Conference: 23rd ACM International Conference on Multimodal Interaction 2021
Abbreviated title: ICMI 2021
Country/Territory: Canada
City: Montreal
Period: 18/10/21 to 22/10/21

Keywords

  • allergens
  • sight-impaired people
  • textual VQA
  • textual extraction

ASJC Scopus subject areas

  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Hardware and Architecture
  • Human-Computer Interaction
