Abstract
We present and evaluate a novel approach to natural language generation (NLG) in statistical spoken dialogue systems (SDS) using
a data-driven statistical optimisation framework for incremental information presentation (IP), where there is a trade-off between presenting ``enough'' information to the user and keeping the utterances short and understandable. The trained IP model is adaptive to variation from the current generation context (e.g.\ a user and a non-deterministic sentence planner), and it incrementally adapts the IP policy at the turn level. Reinforcement learning is used to automatically optimise the IP policy with respect to a data-driven objective function. In a case study on presenting restaurant information, we show that an optimised IP strategy trained on Wizard-of-Oz (WoZ) data outperforms a baseline mimicking the wizard behaviour in terms of total reward gained. The policy is then also
tested with real users, and improves on a conventional hand-coded IP strategy used in a deployed SDS in terms of overall task success.
The evaluation found that the trained information presentation strategy significantly improves dialogue task completion for real users, with up to an 8.2\% increase in task success. This methodology also provides new insights into the nature of the IP
problem, which has previously been treated as a module following dialogue management with no access to lower-level context features
(e.g.\ from a surface realiser and/or speech synthesiser).
Original language | English |
---|---|
Pages (from-to) | 979-994 |
Number of pages | 16 |
Journal | ACM Transactions on Speech and Language Processing |
Volume | 22 |
Issue number | 5 |
DOIs | |
Publication status | Published - 3 Apr 2014 |