Learning non-cooperative behaviour for dialogue agents

Ioannis Efstathiou, Oliver Lemon

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Non-cooperative dialogue behaviour for artificial agents (e.g. deception and information hiding) has been identified as important in a variety of application areas, including education and health-care, but it has not yet been addressed using modern statistical approaches to dialogue agents. Deception has also been argued to be a requirement for high-order intentionality in AI. We develop and evaluate a statistical dialogue agent using Reinforcement Learning which learns to perform non-cooperative dialogue moves in order to complete its own objectives in a stochastic trading game with imperfect information. We show that, when given the ability to perform both cooperative and non-cooperative dialogue moves, such an agent can learn to bluff and to lie so as to win more games. For example, we show that a non-cooperative dialogue agent learns to win 10.5% more games against a strong rule-based adversary than an optimised agent which cannot perform non-cooperative moves. This work is the first to show how agents can learn to use dialogue in a non-cooperative way to meet their own goals.

Original language: English
Title of host publication: Frontiers in Artificial Intelligence and Applications
Publisher: IOS Press
Pages: 999-1000
Number of pages: 2
Volume: 263
ISBN (Print): 9781614994183
DOIs
Publication status: Published - 2014
Event: 21st European Conference on Artificial Intelligence - Prague, Czech Republic
Duration: 18 Aug 2014 → 22 Aug 2014

Publication series

Name: Frontiers in Artificial Intelligence and Applications
Volume: 263
ISSN (Print): 0922-6389

Conference

Conference: 21st European Conference on Artificial Intelligence
Abbreviated title: ECAI 2014
Country/Territory: Czech Republic
City: Prague
Period: 18/08/14 → 22/08/14
