Abstract
Recent advances in corpus-based Natural Language Generation (NLG) hold the promise of being easily portable across domains, but require costly training data, consisting of meaning representations (MRs) paired with Natural Language (NL) utterances. In this work, we propose a novel framework for crowd-sourcing high quality NLG training data, using automatic quality control measures and evaluating different MRs with which to elicit data. We show that pictorial MRs result in better NL data being collected than logic-based MRs: utterances elicited by pictorial MRs are judged as significantly more natural, more informative, and better phrased, with a significant increase in average quality ratings (around 0.5 points on a 6-point scale), compared to using the logical MRs. As the MR becomes more complex, the benefits of pictorial stimuli increase. The collected data will be released as part of this submission.
Original language | English |
---|---|
Title of host publication | Proceedings of the 9th International Natural Language Generation conference |
Publisher | Association for Computational Linguistics |
Pages | 265-273 |
Publication status | Published - 2016 |
Event | 9th International Natural Language Generation Conference - University of Edinburgh building at 50 George Square, Edinburgh, United Kingdom. Duration: 5 Sept 2016 → 8 Sept 2016. http://www.macs.hw.ac.uk/InteractionLab/INLG2016/index.html http://www.macs.hw.ac.uk/InteractionLab/INLG2016/# |
Conference
Conference | 9th International Natural Language Generation Conference |
---|---|
Abbreviated title | INLG 2016 |
Country/Territory | United Kingdom |
City | Edinburgh |
Period | 5/09/16 → 8/09/16 |
Fingerprint
Dive into the research topics of 'Crowd-sourcing NLG Data: Pictures Elicit Better Data'. Together they form a unique fingerprint.
Datasets
- NLG dataset
Novikova, J. (Creator), Rieser, V. (Creator) & Lemon, O. (Creator), Heriot-Watt University, 2016
DOI: 10.17861/f9072c37-3db8-4ddf-9913-aa6f24e61cf0, https://github.com/jeknov/INLG_16_submission