Abstract
Shapley effects are enjoying increasing popularity as importance measures. These indices allocate the variance of the quantity of interest among the risk factors, so that a risk factor explaining more variance than another is deemed more important. Recently, Vallarino et al. (ASTIN Bull J IAA, 2023. https://doi.org/10.1017/asb.2023.34) proposed a computational strategy for Shapley effects using the idea of cohorts of similar observations. However, this strategy becomes extremely computationally demanding when the dataset contains many observations. In this work we propose a computational shortcut based on design of experiments and clustering techniques to reduce the computational time. Using the well-known French claim frequency dataset, we demonstrate a substantial reduction in computational time without a significant loss of accuracy in the estimation of the Shapley effects.
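To illustrate how Shapley effects allocate variance among risk factors, the following is a minimal sketch that computes them by brute-force enumeration of all subsets. It is not the cohort-based strategy of Vallarino et al., nor the shortcut proposed in this work. The function names (`shapley_effects`, `toy_val`) and the toy model are assumptions made purely for illustration: the model is Y = 2·X0 + X1 + X1·X2 with independent standard normal inputs, for which the explained variance Var(E[Y | X_u]) is known in closed form.

```python
from itertools import combinations
from math import factorial


def shapley_effects(d, val):
    """Normalised Shapley effects for d inputs by exact subset enumeration.

    `val(u)` must return the variance explained by the inputs in the
    frozenset `u`, i.e. Var(E[Y | X_u]), with val(empty set) = 0.
    """
    total = val(frozenset(range(d)))  # equals Var(Y)
    effects = []
    for i in range(d):
        others = [j for j in range(d) if j != i]
        sh = 0.0
        for k in range(d):  # size of the subset u that excludes input i
            weight = factorial(k) * factorial(d - k - 1) / factorial(d)
            for u in combinations(others, k):
                u = frozenset(u)
                # marginal gain in explained variance from adding input i
                sh += weight * (val(u | {i}) - val(u))
        effects.append(sh / total)
    return effects


def toy_val(u):
    """Closed-form Var(E[Y | X_u]) for the hypothetical toy model above."""
    v = 0.0
    if 0 in u:
        v += 4.0  # main effect of X0: Var(2 * X0)
    if 1 in u:
        v += 1.0  # main effect of X1: Var(X1)
    if 1 in u and 2 in u:
        v += 1.0  # interaction: Var(X1 * X2)
    return v


effects = shapley_effects(3, toy_val)
print([round(e, 4) for e in effects])  # [0.6667, 0.25, 0.0833]
```

The effects sum to one, and the interaction variance is split equally between X1 and X2. The number of subsets grows as 2^d, and in practice each value `val(u)` has to be estimated from data, which is where the computational burden discussed in the abstract arises.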
Field | Value |
---|---|
Original language | English |
Journal | European Actuarial Journal |
Early online date | 7 Feb 2025 |
DOIs | |
Publication status | E-pub ahead of print - 7 Feb 2025 |
Keywords
- Conditional Latin hypercube sampling
- Hierarchical k-means
- Large insurance data
- Latin hypercube sampling
- Shapley effects
ASJC Scopus subject areas
- Statistics and Probability
- Economics and Econometrics
- Statistics, Probability and Uncertainty