Sight-Impaired People (SIP) need assistance with food packaging, which can contain safety-critical information. Textual Visual Question Answering (VQA) could therefore prove critical in increasing the independence of SIP [26, 35], yet it has only recently received attention. For instance, comprehending text within an image is necessary to determine what type of soup is in a can, how long to cook a microwave meal, when a box of eggs will expire, and whether a meal contains an ingredient the user is allergic to. These examples all relate to a kitchen setting, a particularly challenging area for SIP. We extended the existing Aye-saac voice assistant prototype with this task and setting in mind. We developed textual VQA components to accurately understand what a user is asking, extract relevant text from images in an intelligent manner, and provide natural language answers that build upon the context of previous questions. As our system is intended to be assistive, we designed it with a particular focus on privacy, transparency, and controllability. These are vital objectives that existing systems do not cover. We found that our system outperformed other VQA systems on real food packaging questions asked by SIP from the VizWiz dataset.
|Title of host publication||ICMI '21: Proceedings of the 2021 International Conference on Multimodal Interaction|
|Subtitle of host publication||Montréal QC Canada October 18 - 22, 2021|
|Editors||Zakia Hammal, Carlos Busso, Catherine Pelachaud, Sharon Oviatt, Albert Ali Salah, Guoying Zhao|
|Place of Publication||New York|
|Publisher||Association for Computing Machinery|
|Number of pages||11|
|Publication status||Published - 18 Oct 2021|
|Event||23rd ACM International Conference on Multimodal Interaction 2021 - Montreal, Canada (18 Oct 2021 → 22 Oct 2021)|
|Conference||23rd ACM International Conference on Multimodal Interaction 2021|
|Abbreviated title||ICMI 2021|
|Period||18/10/21 → 22/10/21|
Keywords
- sight-impaired people
- textual VQA
- textual extraction
ASJC Scopus subject areas
- Computer Science Applications
- Computer Vision and Pattern Recognition
- Hardware and Architecture
- Human-Computer Interaction
Dataset supporting the paper "Am I Allergic to This? Assisting Sight Impaired People in the Kitchen"
Brick, E. R. (Creator), Alonso, V. C. (Creator), O'Brien, C. (Creator), Tong, S. (Creator), Tavernier, E. (Creator), Parekh, A. (Creator), Addlesee, J. (Creator) & Lemon, O. (Creator), Heriot-Watt University, Oct 2021