Abstract
Shapley effects are enjoying increasing popularity as importance measures. These indices allocate the variance of the quantity of interest among the risk factors, so that a risk factor explaining more variance than another is deemed more important. Recently, Vallarino et al. (ASTIN Bull J IAA, 2023. https://doi.org/10.1017/asb.2023.34) proposed a computational strategy for Shapley effects using the idea of cohorts of similar observations. However, this strategy becomes extremely computationally demanding if the dataset contains many observations. In this work, we propose a computational shortcut based on design of experiments and clustering techniques to reduce the computational time. Using the well-known French claim frequency dataset, we demonstrate a large reduction in computational time without a significant loss of accuracy in the estimation of the Shapley effects.
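To make the variance allocation described in the abstract concrete, the following minimal sketch computes Shapley effects from a value function that returns the variance explained by a subset of risk factors. It illustrates only the allocation rule, not the cohort-based estimator of Vallarino et al. or the shortcut proposed in the paper. The function name `shapley_effects` and the toy variance components are hypothetical; the toy assumes three independent factors with main-effect variances 1, 4 and 0 and one interaction between factors 0 and 2 with variance 2.

```python
from itertools import combinations
from math import comb


def shapley_effects(value, d):
    """Allocate value(all factors) among d factors with the Shapley rule.

    `value` maps a frozenset of factor indices to the variance explained
    by that subset (hypothetical interface, chosen for illustration).
    """
    effects = []
    for i in range(d):
        others = [j for j in range(d) if j != i]
        total = 0.0
        for size in range(d):
            # Shapley weight for subsets of this size that exclude factor i
            weight = 1.0 / (d * comb(d - 1, size))
            for subset in combinations(others, size):
                u = frozenset(subset)
                # Marginal gain in explained variance from adding factor i
                total += weight * (value(u | {i}) - value(u))
        effects.append(total)
    return effects


# Toy variance components (assumed, not taken from the paper)
MAIN = {0: 1.0, 1: 4.0, 2: 0.0}
INTERACTION = {frozenset({0, 2}): 2.0}


def toy_value(u):
    """Explained variance of subset u when the factors are independent."""
    explained = sum(MAIN[j] for j in u)
    explained += sum(v for pair, v in INTERACTION.items() if pair <= u)
    return explained


effects = shapley_effects(toy_value, 3)
total_variance = toy_value(frozenset({0, 1, 2}))
```

In this toy case the interaction variance is split equally between factors 0 and 2, and the effects add up to the total variance. The double loop visits every subset of factors, and each evaluation of the value function is itself costly when it has to be estimated from many observations.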
| Original language | English |
|---|---|
| Pages (from-to) | 885-898 |
| Number of pages | 14 |
| Journal | European Actuarial Journal |
| Volume | 15 |
| Issue number | 3 |
| Early online date | 7 Feb 2025 |
| Publication status | Published - Dec 2025 |
Keywords
- Conditional Latin hypercube sampling
- Hierarchical k-means
- Large insurance data
- Latin hypercube sampling
- Shapley effects
ASJC Scopus subject areas
- Statistics and Probability
- Economics and Econometrics
- Statistics, Probability and Uncertainty
Fingerprint
Dive into the research topics of 'Accelerating the computation of Shapley effects for datasets with many observations'. Together they form a unique fingerprint.