Training a statistical surface realiser from automatic slot labelling

Heriberto Cuayahuitl, Nina Dethlefs, Helen Hastie, Xingkun Liu

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

11 Citations (Scopus)


Training a statistical surface realiser typically relies on labelled training data or parallel data sets, such as corpora of paraphrases. The procedure for obtaining such data for new domains is not only time-consuming, but it also restricts the incorporation of new semantic slots during an interaction, i.e. using an online learning scenario for automatically extended domains. Here, we present an alternative approach to statistical surface realisation from unlabelled data through automatic semantic slot labelling. The essence of our algorithm is to cluster clauses based on a similarity function that combines lexical and semantic information. Annotations need to be reliable enough to be utilised within a spoken dialogue system. We compare different similarity functions and evaluate our surface realiser - trained from unlabelled data - in a human rating study. Results confirm that a surface realiser trained from automatic slot labels can lead to outputs of comparable quality to outputs trained from human-labelled inputs.
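The clustering idea in the abstract can be illustrated with a small sketch. The abstract does not specify the similarity function or the clustering procedure, so everything below is an assumption made for illustration: Jaccard overlap for both the lexical and the semantic component, a linear interpolation weight `alpha`, a hand-written `word_classes` lexicon, and a greedy single-pass clustering with a fixed `threshold`. All function and parameter names are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combined_similarity(c1: str, c2: str,
                        word_classes: Dict[str, str],
                        alpha: float = 0.5) -> float:
    """Interpolate lexical and semantic similarity (assumed form, not the paper's)."""
    t1, t2 = set(c1.lower().split()), set(c2.lower().split())
    # Lexical component: overlap of surface tokens.
    lexical = jaccard(t1, t2)
    # Semantic component: overlap of coarse word classes from a toy lexicon.
    s1 = {word_classes[t] for t in t1 if t in word_classes}
    s2 = {word_classes[t] for t in t2 if t in word_classes}
    semantic = jaccard(s1, s2)
    return alpha * lexical + (1 - alpha) * semantic


def cluster_clauses(clauses: List[str],
                    word_classes: Dict[str, str],
                    threshold: float = 0.4,
                    alpha: float = 0.5) -> List[List[str]]:
    """Greedy clustering: join the most similar cluster, or start a new one."""
    clusters: List[List[str]] = []
    for clause in clauses:
        best, best_score = None, threshold
        for cluster in clusters:
            # Compare against the first clause of the cluster as its representative.
            score = combined_similarity(clause, cluster[0], word_classes, alpha)
            if score >= best_score:
                best, best_score = cluster, score
        if best is None:
            clusters.append([clause])
        else:
            best.append(clause)
    return clusters


if __name__ == "__main__":
    # Toy data invented for this example.
    toy_classes = {"italian": "FOOD", "chinese": "FOOD",
                   "centre": "AREA", "north": "AREA"}
    toy_clauses = ["it serves italian food", "it serves chinese food",
                   "it is in the city centre", "it is in the north"]
    for i, cluster in enumerate(cluster_clauses(toy_clauses, toy_classes)):
        print(f"slot_{i}: {cluster}")
```

In this sketch each resulting cluster would stand in for one automatically induced slot label; a real system would likely need a richer semantic resource than a hand-written lexicon and a more robust clustering method.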

Original language: English
Title of host publication: 2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - Proceedings
Number of pages: 6
ISBN (Print): 9781479971299
Publication status: Published - 2014
Event: 6th IEEE Workshop on Spoken Language Technology 2014 - South Lake Tahoe, United States
Duration: 7 Dec 2014 - 10 Dec 2014


Conference: 6th IEEE Workshop on Spoken Language Technology 2014
Abbreviated title: SLT 2014
Country/Territory: United States
City: South Lake Tahoe


  • Dialogue systems
  • Semantic slot labelling
  • Surface realisation
  • Unsupervised and supervised learning

ASJC Scopus subject areas

  • Computer Science Applications
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
  • Language and Linguistics

