Unsupervised concept-to-text generation with hypergraphs

Ioannis Konstas, Mirella Lapata

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

75 Citations (Scopus)

Abstract

Concept-to-text generation refers to the task of automatically producing textual output from non-linguistic input. We present a joint model that captures content selection ("what to say") and surface realization ("how to say") in an unsupervised, domain-independent fashion. Rather than breaking up the generation process into a sequence of local decisions, we define a probabilistic context-free grammar that globally describes the inherent structure of the input (a corpus of database records and text describing some of them). We represent our grammar compactly as a weighted hypergraph and recast generation as the task of finding the best derivation tree for a given input. Experimental evaluation on several domains achieves results competitive with state-of-the-art systems that use domain-specific constraints, explicit feature engineering, or labeled data.
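
As an illustration of the idea of "finding the best derivation tree" in a weighted hypergraph, the following is a minimal Python sketch of Viterbi-style search over an acyclic hypergraph. It is not the authors' grammar or decoder: the node names (S, R_temp, R_wind), the words, the weights, and the functions best_derivation and leaves are all hypothetical and chosen only to make the example small.

```python
from collections import defaultdict

# Hypothetical toy hypergraph. Each hyperedge is (head, tails, weight):
# the head node is derived from all of its tail nodes at once, much like
# a weighted grammar rule head -> tails. Nodes without incoming
# hyperedges are treated as terminals (words).
HYPEREDGES = [
    ("S", ("R_temp", "R_wind"), 0.6),  # mention both records
    ("S", ("R_temp",), 0.4),           # mention temperature only
    ("R_temp", ("warm",), 0.7),
    ("R_temp", ("mild",), 0.3),
    ("R_wind", ("breezy",), 0.8),
    ("R_wind", ("windy",), 0.2),
]


def best_derivation(hyperedges, goal):
    """Return (score, tree) of the highest-scoring derivation of `goal`.

    Assumes the hypergraph is acyclic. A derivation's score is the
    product of the weights of the hyperedges it uses.
    """
    incoming = defaultdict(list)
    for head, tails, weight in hyperedges:
        incoming[head].append((tails, weight))

    memo = {}

    def visit(node):
        if node in memo:
            return memo[node]
        if node not in incoming:
            # Terminal node: score 1.0, and the tree is the word itself.
            memo[node] = (1.0, node)
            return memo[node]
        best = (0.0, None)
        for tails, weight in incoming[node]:
            score = weight
            subtrees = []
            for tail in tails:
                tail_score, tail_tree = visit(tail)
                score *= tail_score
                subtrees.append(tail_tree)
            if score > best[0]:
                best = (score, (node, subtrees))
        memo[node] = best
        return best

    return visit(goal)


def leaves(tree):
    """Read the generated words off a derivation tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    _, children = tree
    return [word for child in children for word in leaves(child)]


score, tree = best_derivation(HYPEREDGES, "S")
print(leaves(tree))
```

In this toy setting, choosing which record nodes appear under S plays the role of content selection and choosing the words under each record plays the role of surface realization; both decisions are made jointly by a single search for the best-scoring derivation.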

Original language: English
Title of host publication: Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics
Subtitle of host publication: Human Language Technologies
Publisher: Association for Computational Linguistics
Pages: 752-761
Number of pages: 10
ISBN (Electronic): 9781937284206
Publication status: Published - Jun 2012
Event: 2012 Conference of the North American Chapter of the Association for Computational Linguistics - Montreal, Canada
Duration: 3 Jun 2012 to 8 Jun 2012

Conference

Conference: 2012 Conference of the North American Chapter of the Association for Computational Linguistics
Abbreviated title: NAACL HLT 2012
Country/Territory: Canada
City: Montreal
Period: 3/06/12 to 8/06/12

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science Applications
  • Linguistics and Language

