Abstract
We present a neural-network-based system that learns a multimodal representation of images and words. This representation supports bidirectional grounding between words and the visual attributes they denote, such as colour, size and object name. We also present a new dataset captured specifically for this task.
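The abstract describes the idea but not the model itself. As a minimal toy sketch of the general approach (the names, dimensions, linear maps and softmax objective below are our own assumptions, not the paper's method), a shared embedding space in which images and words can be grounded in both directions might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical vocabulary of visual-attribute words (colour/size names).
words = ["red", "blue", "big", "small"]
n_words, feat_dim, embed_dim = len(words), 6, 4

# Toy data: one prototype feature vector per word, plus noisy "images".
prototypes = rng.normal(size=(n_words, feat_dim))
labels = rng.integers(0, n_words, size=200)
images = prototypes[labels] + 0.1 * rng.normal(size=(200, feat_dim))

# Linear maps into a shared multimodal embedding space.
W_img = rng.normal(scale=0.3, size=(feat_dim, embed_dim))   # image projection
W_word = rng.normal(scale=0.3, size=(n_words, embed_dim))   # word embedding table

# Train with a softmax cross-entropy objective: a matched image/word
# pair should score higher than mismatched pairs (a contrastive-style loss).
lr = 0.3
onehot = np.eye(n_words)[labels]
for _ in range(1500):
    img_emb = images @ W_img                    # (200, embed_dim)
    logits = img_emb @ W_word.T                 # (200, n_words)
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    g = (p - onehot) / len(images)              # cross-entropy gradient
    grad_word = g.T @ img_emb
    grad_img = images.T @ (g @ W_word)
    W_word -= lr * grad_word
    W_img -= lr * grad_img

def word_for_image(feat):
    """Image -> word: highest-scoring word for the projected features."""
    return words[int(np.argmax(W_word @ (feat @ W_img)))]

def rank_images_for_word(word, feats):
    """Word -> images: rank candidate features by their score for `word`."""
    return np.argsort(-(feats @ W_img) @ W_word[words.index(word)])
```

Because both modalities land in one space, grounding runs in either direction: project an image and take the best-scoring word, or fix a word and rank candidate images by their score for it.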
Original language | English |
---|---|
Title of host publication | HRI '20: Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction |
Publisher | Association for Computing Machinery |
Pages | 445–446 |
Number of pages | 2 |
ISBN (Electronic) | 9781450370578 |
DOIs | |
Publication status | Published - 23 Mar 2020 |
Event | 15th Annual ACM/IEEE International Conference on Human-Robot Interaction 2020, Corn Exchange, Cambridge, United Kingdom. Duration: 23 Mar 2020 → 26 Mar 2020. https://humanrobotinteraction.org/2020/ |
Conference
Conference | 15th Annual ACM/IEEE International Conference on Human-Robot Interaction 2020 |
---|---|
Abbreviated title | HRI 2020 |
Country/Territory | United Kingdom |
City | Cambridge |
Period | 23/03/20 → 26/03/20 |
Internet address | https://humanrobotinteraction.org/2020/ |
Keywords
- Datasets
- Neural networks
- Robotics
- Symbol grounding
- Unsupervised learning
ASJC Scopus subject areas
- Artificial Intelligence
- Human-Computer Interaction
- Electrical and Electronic Engineering