Learning adaptive referring expression generation policies for spoken dialogue systems

Srinivasan Chandrasekaran Janarthanam, Oliver Lemon

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

6 Citations (Scopus)


We address the problem that different users have different lexical knowledge about problem domains, so that automated dialogue systems need to adapt their generation choices online to the users' domain knowledge as they encounter them. We approach this problem using Reinforcement Learning in Markov Decision Processes (MDPs). We present a reinforcement learning framework to learn adaptive referring expression generation (REG) policies that can adapt dynamically to users with different domain knowledge levels. In contrast to related work, we also propose a new statistical user model which incorporates the lexical knowledge of different users. We evaluate this framework by showing that it allows us to learn dialogue policies that automatically adapt their choice of referring expressions online to different users, and that these policies are significantly better than hand-coded adaptive policies for this problem. The learned policies are consistently between 2 and 8 turns shorter than a range of different hand-coded but adaptive baseline REG policies.
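The general idea of learning an adaptive REG policy with reinforcement learning can be illustrated with a toy example. The sketch below is not the authors' framework or user model, neither of which is specified in this abstract. It is a minimal tabular Q-learning setup under invented assumptions: a hypothetical simulated user who is either an "expert" or a "novice", two hypothetical expression types ("jargon" and "descriptive"), made-up turn costs, and a three-valued belief state. Dialogues are truncated after a fixed number of references and the value update always bootstraps from the next state, which is a simplification.

```python
import random

ACTIONS = ["jargon", "descriptive"]
BELIEFS = ["unknown", "expert", "novice"]  # system's belief about the user


def simulate_turn(user_type, action):
    """Hypothetical user simulation (all costs are invented for illustration).

    Returns (cost_in_turns, observed_user_type_or_None).
    """
    if action == "descriptive":
        return 2, None        # longer, understood by everyone, reveals nothing
    if user_type == "expert":
        return 1, "expert"    # jargon understood immediately
    return 3, "novice"        # clarification request, then a re-description


def train(episodes=5000, refs_per_dialogue=5, alpha=0.1, gamma=0.9,
          epsilon=0.2, seed=0):
    """Tabular Q-learning over (belief, action) pairs."""
    rng = random.Random(seed)
    q = {(b, a): 0.0 for b in BELIEFS for a in ACTIONS}
    for _ in range(episodes):
        user_type = rng.choice(["expert", "novice"])  # hidden from the system
        belief = "unknown"
        for _ in range(refs_per_dialogue):
            # Epsilon-greedy choice of referring expression type.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(belief, a)])
            cost, observed = simulate_turn(user_type, action)
            next_belief = observed or belief
            # Reward is the negative turn cost; always bootstrap (truncation).
            target = -cost + gamma * max(q[(next_belief, a)] for a in ACTIONS)
            q[(belief, action)] += alpha * (target - q[(belief, action)])
            belief = next_belief
    return q


def greedy_policy(q):
    """Pick the highest-valued expression type for each belief state."""
    return {b: max(ACTIONS, key=lambda a: q[(b, a)]) for b in BELIEFS}


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this toy setting the learned policy depends on what the system has observed about the user, which is the sense of "adaptive" used above. Fewer turns per dialogue correspond to higher reward, loosely mirroring the turn-based comparison reported in the abstract.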

Original language: English
Title of host publication: Empirical Methods in Natural Language Generation
Subtitle of host publication: Data-oriented Methods and Empirical Evaluation
Editors: Emiel Krahmer, Mariët Theune
Number of pages: 18
ISBN (Electronic): 9783642155734
ISBN (Print): 9783642155727
Publication status: Published - 2010
Event: 12th European Workshop on Natural Language Generation 2009 - Athens, Greece
Duration: 30 Mar 2009 - 3 Apr 2009

Publication series

Name: Lecture Notes in Computer Science
ISSN (Print): 0302-9743


Conference: 12th European Workshop on Natural Language Generation 2009
Abbreviated title: ENLG 2009


  • Referring Expression Generation
  • Reinforcement Learning
  • Spoken Dialogue System

