Natural language generation as incremental planning under uncertainty: adaptive information presentation for statistical dialogue systems

Research output: Contribution to journal › Article › peer-review

25 Citations (Scopus)

Abstract

We present and evaluate a novel approach to natural language generation (NLG) in statistical spoken dialogue systems (SDS) using a data-driven statistical optimization framework for incremental information presentation (IP), which must trade off presenting “enough” information to the user against keeping utterances short and understandable. The trained IP model is adaptive to variation from the current generation context (e.g. a user and a non-deterministic sentence planner), and it incrementally adapts the IP policy at the turn level. Reinforcement learning is used to automatically optimize the IP policy with respect to a data-driven objective function. In a case study on presenting restaurant information, we show that an optimized IP strategy trained on Wizard-of-Oz data outperforms a baseline mimicking the wizard behavior in terms of total reward gained. The policy is then also tested with real users, and improves on a conventional hand-coded IP strategy used in a deployed SDS in terms of overall task success. The evaluation found that the trained IP strategy significantly improves dialogue task completion for real users, with up to an 8.2% increase in task success. This methodology also provides new insights into the nature of the IP problem, which has previously been treated as a module following dialogue management with no access to lower-level context features (e.g. from a surface realizer and/or speech synthesizer).
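
The trade-off described in the abstract can be illustrated with a toy example. The sketch below is not the authors' model: the abstract does not give their state features, learner, or data-driven reward function. It assumes a hypothetical simulated user who needs one or two attributes, a made-up success reward and per-attribute length penalty, and plain tabular Q-learning with no discounting. All names and numbers are assumptions chosen for illustration.

```python
import random

ACTIONS = ("present_more", "stop")
MAX_ITEMS = 4          # hypothetical cap on attributes presented
LENGTH_PENALTY = 1.0   # hypothetical cost of each extra attribute
SUCCESS_REWARD = 5.0   # hypothetical reward when the user heard "enough"


def allowed(state):
    """Once the cap is reached, the only option is to stop."""
    return ("stop",) if state == MAX_ITEMS else ACTIONS


def train(episodes=20000, alpha=0.1, epsilon=0.1, seed=0):
    """Tabular Q-learning; state = number of attributes presented so far."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(MAX_ITEMS + 1) for a in ACTIONS}
    for _ in range(episodes):
        # Hypothetical simulated user: hidden need of 1 or (more often) 2.
        need = rng.choice([1, 2, 2, 2])
        state = 0
        while True:
            options = allowed(state)
            if rng.random() < epsilon:          # explore
                action = rng.choice(options)
            else:                               # exploit current estimate
                action = max(options, key=lambda a: q[(state, a)])
            if action == "stop":
                # Terminal step: reward only if enough was presented.
                reward = SUCCESS_REWARD if state >= need else 0.0
                q[(state, action)] += alpha * (reward - q[(state, action)])
                break
            # Presenting one more attribute costs utterance length.
            nxt = state + 1
            target = -LENGTH_PENALTY + max(q[(nxt, a)] for a in allowed(nxt))
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
    return q


def items_presented(q):
    """Follow the greedy policy from an empty utterance and count attributes."""
    state = 0
    while max(allowed(state), key=lambda a: q[(state, a)]) == "present_more":
        state += 1
    return state
```

Calling `items_presented(train())` reports how many attributes the learned greedy policy presents before stopping. With these made-up numbers, presenting a third attribute costs more than it can gain, so the policy should settle on a short presentation rather than listing everything.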
Original language: English
Pages (from-to): 979-994
Number of pages: 16
Journal: IEEE/ACM Transactions on Audio Speech and Language Processing
Volume: 22
Issue number: 5
Publication status: Published - May 2014
