Abstract
Generalized Additive Models (GAMs) are widely used in statistics. In this work, we tackle the challenge of identifying the most influential variables in GAMs. To this end, we introduce a variance allocation approach based on the Shapley value. We derive a closed-form expression for the resulting importance indices, which allows them to be computed on high-dimensional datasets and under any dependence structure. As a practical implication, a variable whose importance is negligible can be safely eliminated from the GAM, which simplifies the model. Through our case studies, we demonstrate that Shapley values are more informative than p-values for ranking the importance of variables. All the code is available online in the supplementary material.
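The closed-form expression itself is not reproduced in this record. As a minimal sketch of the idea, assume the value of a coalition S of variables is the variance of the partial additive fit, v(S) = Var(sum of f_j(X_j) for j in S). This value function, the toy components `f1`, `f2`, `f3`, and the simulated data are all assumptions made for illustration and may differ from the definition used in the paper. Under that assumption, the Shapley value of variable j reduces to Cov(f_j(X_j), sum over k of f_k(X_k)), which the code checks against a brute-force enumeration of coalitions.

```python
import itertools
import math

import numpy as np


def coalition_value(cov, members):
    """v(S) = Var(sum_{j in S} f_j): sum of the covariance entries indexed by S."""
    if len(members) == 0:
        return 0.0
    idx = np.array(members, dtype=int)
    return float(cov[idx][:, idx].sum())


def shapley_bruteforce(cov):
    """Shapley values of the assumed variance game, enumerating every coalition."""
    d = cov.shape[0]
    phi = np.zeros(d)
    for j in range(d):
        others = [k for k in range(d) if k != j]
        for r in range(d):
            # Standard Shapley weight for coalitions of size r that exclude j.
            weight = math.factorial(r) * math.factorial(d - r - 1) / math.factorial(d)
            for subset in itertools.combinations(others, r):
                s = list(subset)
                gain = coalition_value(cov, s + [j]) - coalition_value(cov, s)
                phi[j] += weight * gain
    return phi


def shapley_closed_form(cov):
    """Row sums of the covariance matrix: Cov(f_j, sum_k f_k) for each j."""
    return cov.sum(axis=1)


# Hypothetical data: X1 and X2 are correlated, X3 is independent of both.
rng = np.random.default_rng(0)
n = 5000
x1 = rng.normal(size=n)
x2 = 0.7 * x1 + math.sqrt(1 - 0.7**2) * rng.normal(size=n)
x3 = rng.normal(size=n)

# Hypothetical fitted additive components f_j(X_j), one column per variable.
F = np.column_stack([np.sin(x1), x2**2, 0.1 * x3])
cov = np.cov(F, rowvar=False)

phi_bf = shapley_bruteforce(cov)
phi_cf = shapley_closed_form(cov)
total_var = float(np.var(F.sum(axis=1), ddof=1))

# Share of the explained variance attributed to each variable.
print("Shapley shares:", phi_cf / total_var)
```

The brute-force routine costs on the order of 2^d coalition evaluations per variable, whereas the row-sum shortcut needs only the d-by-d covariance matrix of the fitted components, which is what makes this kind of index tractable in high dimensions. The shares sum to one, so a variable with a share close to zero is a candidate for removal.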
Original language | English |
---|---|
Pages (from-to) | 1-10 |
Number of pages | 10 |
Journal | Journal of Computational and Graphical Statistics |
Early online date | 15 Apr 2024 |
Publication status | E-pub ahead of print - 15 Apr 2024 |
Keywords
- Confidence intervals
- Cooperative game theory
- Global sensitivity analysis
- Statistical significance
- Variable importance
ASJC Scopus subject areas
- Discrete Mathematics and Combinatorics
- Statistics and Probability
- Statistics, Probability and Uncertainty