Training an adaptive dialogue policy for interactive learning of visually grounded word meanings

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We present a multi-modal dialogue system for interactive learning of perceptually grounded word meanings from a human tutor. The system integrates an incremental, semantic parsing/generation framework - Dynamic Syntax and Type Theory with Records (DS-TTR) - with a set of visual classifiers that are learned throughout the interaction and which ground the meaning representations that it produces. We use this system in interaction with a simulated human tutor to study the effects of different dialogue policies and capabilities on accuracy of learned meanings, learning rates, and efforts/costs to the tutor. We show that the overall performance of the learning agent is affected by (1) who takes initiative in the dialogues; (2) the ability to express/use their confidence level about visual attributes; and (3) the ability to process elliptical and incrementally constructed dialogue turns. Ultimately, we train an adaptive dialogue policy which optimises the trade-off between classifier accuracy and tutoring costs.
Original language: English
Title of host publication: Proceedings of the SIGDIAL 2016 Conference
Publisher: Association for Computational Linguistics
Pages: 339-349
Number of pages: 11
ISBN (Print): 9781945626234
Publication status: Published - 15 Sep 2016
Event: The 17th Annual SIGdial Meeting on Discourse and Dialogue - Institute for Creative Technologies, Los Angeles, United States
Duration: 13 Sep 2016 to 15 Sep 2016
http://www.sigdial.org/workshops/conference17/

Conference

Conference: The 17th Annual SIGdial Meeting on Discourse and Dialogue
Abbreviated title: SIGDIAL
Country/Territory: United States
City: Los Angeles
Period: 13/09/16 to 15/09/16

Keywords

  • Natural language processing
  • Robotics, Development, Language action, Social interaction, Learning
  • Artificial intelligence
