TY - GEN
T1 - An Introduction to Stacking Regression for Economists
AU - Ahrens, Achim
AU - Ersoy, Erkal
AU - Iakovlev, Vsevolod
AU - Li, Haoyang
AU - Schaffer, Mark E.
N1 - Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022/7/30
Y1 - 2022/7/30
N2 - We present an introduction to “stacked generalization” (Wolpert in Neural Networks 5(2):241–259, 1992 [21]). The increased availability of “Big Data” in economics and the emergence of non-traditional machine learning tools present new opportunities for applied economics, but they also impose additional challenges. The full range of supervised learning algorithms offers a rich variety of methods, each of which may be better suited to a particular problem. Selecting the optimal algorithm and tuning its parameters can be time-consuming, particularly given a potential lack of experience with machine learning and the relevant economic literature. “Stacking” is a useful tool for addressing this: it is an ensemble method that combines multiple supervised machine learners in order to achieve more accurate predictions than any of the individual machine learners could produce on its own. Besides providing an introduction to the stacking methodology, we also present a short survey of some of the estimators or “base learners” that can be used with stacking: lasso, ridge, elastic net, support vector machines, gradient boosting, and random forests. Our empirical example of how to use stacking regression draws on the study by Fatehkia et al. (PLOS ONE 14(2):1–16, 2019 [6]): predicting crime rates in localities using demographic and socioeconomic data combined with data from Facebook on user interests.
AB - We present an introduction to “stacked generalization” (Wolpert in Neural Networks 5(2):241–259, 1992 [21]). The increased availability of “Big Data” in economics and the emergence of non-traditional machine learning tools present new opportunities for applied economics, but they also impose additional challenges. The full range of supervised learning algorithms offers a rich variety of methods, each of which may be better suited to a particular problem. Selecting the optimal algorithm and tuning its parameters can be time-consuming, particularly given a potential lack of experience with machine learning and the relevant economic literature. “Stacking” is a useful tool for addressing this: it is an ensemble method that combines multiple supervised machine learners in order to achieve more accurate predictions than any of the individual machine learners could produce on its own. Besides providing an introduction to the stacking methodology, we also present a short survey of some of the estimators or “base learners” that can be used with stacking: lasso, ridge, elastic net, support vector machines, gradient boosting, and random forests. Our empirical example of how to use stacking regression draws on the study by Fatehkia et al. (PLOS ONE 14(2):1–16, 2019 [6]): predicting crime rates in localities using demographic and socioeconomic data combined with data from Facebook on user interests.
UR - http://www.scopus.com/inward/record.url?scp=85135540362&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-97273-8_2
DO - 10.1007/978-3-030-97273-8_2
M3 - Conference contribution
AN - SCOPUS:85135540362
SN - 9783030972721
T3 - Studies in Systems, Decision and Control
SP - 7
EP - 29
BT - Credible Asset Allocation, Optimal Transport Methods, and Related Topics. TES 2022
PB - Springer
ER -