TY - JOUR
T1 - I don't understand! Evaluation methods for natural language explanations
AU - Clinciu, Miruna
AU - Eshghi, Arash
AU - Hastie, Helen
N1 - Funding Information:
This work was supported by the EPSRC Centre for Doctoral Training in Robotics and Autonomous Systems at Heriot-Watt University and the University of Edinburgh. Clinciu’s PhD is funded by Schlumberger Cambridge Research Limited (EP/L016834/1, 2018-2021). This work was also supported by the EPSRC ORCA Hub (EP/R026173/1, 2017-2021) and UKRI Trustworthy Autonomous Systems Node on Trust (EP/V026682/1, 2020-2024).
Publisher Copyright:
Copyright © 2021 for this paper by its authors.
PY - 2021/7/2
Y1 - 2021/7/2
N2 - Explainability of intelligent systems is key for future adoption. While much work is ongoing with regard to developing methods of explaining complex opaque systems, there is little current work on evaluating how effective these explanations are, in particular with respect to the user’s understanding. Natural language (NL) explanations can be seen as an intuitive channel between humans and artificial intelligence systems, in particular for enhancing transparency. This paper presents existing work on how evaluation methods from the field of Natural Language Generation (NLG) can be mapped onto NL explanations. We also present a preliminary investigation into the relationship between linguistic features and human evaluation, using a dataset of NL explanations derived from Bayesian Networks.
AB - Explainability of intelligent systems is key for future adoption. While much work is ongoing with regard to developing methods of explaining complex opaque systems, there is little current work on evaluating how effective these explanations are, in particular with respect to the user’s understanding. Natural language (NL) explanations can be seen as an intuitive channel between humans and artificial intelligence systems, in particular for enhancing transparency. This paper presents existing work on how evaluation methods from the field of Natural Language Generation (NLG) can be mapped onto NL explanations. We also present a preliminary investigation into the relationship between linguistic features and human evaluation, using a dataset of NL explanations derived from Bayesian Networks.
KW - Evaluation
KW - Explanations
KW - Natural language
UR - http://www.scopus.com/inward/record.url?scp=85109792350&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85109792350
SN - 1613-0073
VL - 2894
SP - 17
EP - 24
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
T2 - SICSA Workshop on eXplainable Artificial Intelligence 2021
Y2 - 1 June 2021 through 1 June 2021
ER -