Inducing document plans for concept-to-text generation

Ioannis Konstas, Mirella Lapata

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

43 Citations (Scopus)

Abstract

In a language generation system, a content planner selects which elements must be included in the output text and determines the ordering between them. Recent empirical approaches perform content selection without any ordering and thus have no means to ensure that the output is coherent. In this paper we focus on the problem of generating text from a database and present a trainable end-to-end generation system that includes both content selection and ordering. Content plans are represented intuitively by a set of grammar rules that operate on the document level and are acquired automatically from training data. We develop two approaches: the first one is inspired by Rhetorical Structure Theory and represents the document as a tree of discourse relations between database records; the second one requires little linguistic sophistication and uses tree structures to represent global patterns of database record sequences within a document. Experimental evaluation on two domains yields considerable improvements over the state of the art for both approaches.

Original language: English
Title of host publication: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing
Publisher: Association for Computational Linguistics
Pages: 1503-1514
Number of pages: 12
ISBN (Electronic): 9781937284978
Publication status: Published - Oct 2013
Event: 2013 Conference on Empirical Methods in Natural Language Processing - Seattle, United States
Duration: 18 Oct 2013 – 21 Oct 2013

Conference

Conference: 2013 Conference on Empirical Methods in Natural Language Processing
Abbreviated title: EMNLP 2013
Country/Territory: United States
City: Seattle
Period: 18/10/13 – 21/10/13

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Information Systems
  • Computer Vision and Pattern Recognition
