Challenging neural dialogue models with natural data: Memory networks fail on incremental phenomena

Igor Shalyminov, Arash Eshghi, Oliver Lemon

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Natural, spontaneous dialogue proceeds incrementally on a word-by-word basis, and it contains many sorts of disfluency, such as mid-utterance/sentence hesitations, interruptions, and self-corrections. But training data for machine learning approaches to dialogue processing is often either cleaned up or wholly synthetic in order to avoid such phenomena. The question then arises of how well systems trained on such clean data generalise to real spontaneous dialogue, or indeed whether they are trainable at all on naturally occurring dialogue data. To answer this question, we created a new corpus called bAbI+ by systematically adding natural spontaneous incremental dialogue phenomena such as restarts and self-corrections to Facebook AI Research’s bAbI dialogues dataset. We then explore the performance of a state-of-the-art retrieval model, MemN2N (Bordes et al., 2017; Sukhbaatar et al., 2015), on this more natural dataset. Results show that the semantic accuracy of the MemN2N model drops drastically, and that although it is in principle able to learn to process the constructions in bAbI+, it needs an impractical amount of training data to do so. Finally, we go on to show that an incremental semantic parser, DyLan, shows 100% semantic accuracy on both bAbI and bAbI+, highlighting the generalisation properties of linguistically informed dialogue models.
Original language: English
Title of host publication: Proceedings of the 21st Workshop on the Semantics and Pragmatics of Dialogue (SemDial 2017 - SaarDial)
Editors: Volha Petukhova, Ye Tian
Pages: 125-133
Publication status: Published - Aug 2017
Event: 21st Workshop on the Semantics and Pragmatics of Dialogue - Saarbrücken, Germany
Duration: 15 Aug 2017 → 17 Aug 2017
Conference number: 21
http://www.saardial.uni-saarland.de/?page_id=2

Workshop

Workshop: 21st Workshop on the Semantics and Pragmatics of Dialogue
Abbreviated title: Semdial 2017 - Saardial
Country/Territory: Germany
City: Saarbrücken
Period: 15/08/17 → 17/08/17
