Abstract
This paper describes the E2E data, a new dataset for training end-to-end, data-driven natural language generation systems in the restaurant domain, which is ten times bigger than existing, frequently used datasets in this area. The E2E dataset poses new challenges: (1) its human reference texts show more lexical richness and syntactic variation, including discourse phenomena; (2) generating from this set requires content selection. As such, learning from this dataset promises more natural, varied and less template-like system utterances. We also establish a baseline on this dataset, which illustrates some of the difficulties associated with this data.
Original language | English |
---|---|
Title of host publication | Proceedings of the SIGDIAL 2017 Conference |
Place of Publication | Stroudsburg, PA, USA |
Publisher | Association for Computational Linguistics |
Pages | 201-206 |
Number of pages | 6 |
ISBN (Electronic) | 978-1-945626-82-1 |
Publication status | Published - 16 Aug 2017 |
Event | 18th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Universität des Saarlandes, Saarbrücken, Germany; Duration: 15 Aug 2017 → 17 Aug 2017; http://www.sigdial.org/workshops/conference18/ |
Conference
Conference | 18th Annual Meeting of the Special Interest Group on Discourse and Dialogue |
---|---|
Abbreviated title | SIGDIAL 2017 |
Country/Territory | Germany |
City | Saarbrücken |
Period | 15/08/17 → 17/08/17 |
Internet address | http://www.sigdial.org/workshops/conference18/ |
Datasets associated with 'The E2E Dataset: New Challenges For End-to-End Generation'
- The E2E Challenge Dataset
Novikova, J. (Creator), Dusek, O. (Creator) & Rieser, V. (Creator), Heriot-Watt University, Nov 2017
DOI: 10.17861/c07fe36b-d0fa-4bbe-967c-f8b4341a133f, http://www.macs.hw.ac.uk/InteractionLab/E2E/
Dataset