TY - CHAP
T1 - Trust in Robot Benchmarking and Benchmarking for Trustworthy Robots
AU - Thoduka, Santosh
AU - Nair, Deebul
AU - Caleb-Solly, Praminda
AU - Dragone, Mauro
AU - Cavallo, Filippo
AU - Hochgeschwender, Nico
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024/6/5
Y1 - 2024/6/5
AB - Trustworthy evaluation of robots is necessary for them to be deployed and accepted in society. Scientific benchmarking competitions provide a way to evaluate robots outside of lab conditions. We propose a progressive and iterative benchmarking process through competitions, which incorporates an objective dataset-based evaluation, evaluation on a remote robot, and field evaluations for individual robot functionalities and complete tasks, in a cyclical process similar to the machine learning lifecycle, with a view to achieving trustworthy evaluation. The inclusion of out-of-distribution data, failure scenarios and user studies as part of the benchmarking process addresses the necessity to evaluate robot systems on non-functional qualities such as fault tolerance, adaptability and social acceptance, in addition to their functional abilities, to improve trustworthiness.
KW - Robot benchmarking
KW - Trustworthiness
UR - http://www.scopus.com/inward/record.url?scp=85198064682&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-55817-7_3
DO - 10.1007/978-3-031-55817-7_3
M3 - Chapter
AN - SCOPUS:85198064682
SN - 9783031558160
T3 - Studies in Computational Intelligence
SP - 31
EP - 51
BT - Producing Artificial Intelligent Systems
PB - Springer
ER -