TY - JOUR
T1 - Pseudo-model-free hedging for variable annuities via deep reinforcement learning
AU - Chong, Wing Fung
AU - Cui, Haoen
AU - Li, Yuxuan
N1 - Funding Information:
This work was first initiated by the authors at the Illinois Risk Lab in January 2020. This work was presented at the 2020 Actuarial Research Conference in August 2020, the United As One: 24th International Congress on Insurance: Mathematics and Economics in July 2021, the 2021 Actuarial Research Conference in August 2021, Heriot-Watt University in November 2021, the University of Amsterdam in June 2022, and the 2022 Insurance Data Science Conference in June 2022. The authors thank the participants for fruitful comments. This work utilizes resources supported by the National Science Foundation’s Major Research Instrumentation program, grant #1725729, as well as the University of Illinois at Urbana-Champaign. The authors are grateful to anonymous reviewers for their careful reading and insightful comments. The programming code is publicly available on GitHub at the following link: https://github.com/yuxuanli-lyx/gmmb_gmdb_rl_hedging
Publisher Copyright:
© The Author(s), 2023. Published by Cambridge University Press on behalf of Institute and Faculty of Actuaries.
PY - 2023/11
Y1 - 2023/11
N2 - This paper proposes a two-phase deep reinforcement learning approach for hedging variable annuity contracts with both GMMB and GMDB riders, which can address model miscalibration in Black-Scholes financial and constant force of mortality actuarial market environments. In the training phase, an infant reinforcement learning agent interacts with a pre-designed training environment, collects sequential anchor-hedging reward signals, and gradually learns how to hedge the contracts. As expected, after a sufficient number of training steps, the trained reinforcement learning agent hedges, in the training environment, as well as the correct Delta while outperforming misspecified Deltas. In the online learning phase, the trained reinforcement learning agent interacts with the market environment in real time, collects single terminal reward signals, and self-revises its hedging strategy. The hedging performance of the further trained reinforcement learning agent is demonstrated via an illustrative example on a rolling basis to reveal the self-revision capability of the hedging strategy by online learning.
AB - This paper proposes a two-phase deep reinforcement learning approach for hedging variable annuity contracts with both GMMB and GMDB riders, which can address model miscalibration in Black-Scholes financial and constant force of mortality actuarial market environments. In the training phase, an infant reinforcement learning agent interacts with a pre-designed training environment, collects sequential anchor-hedging reward signals, and gradually learns how to hedge the contracts. As expected, after a sufficient number of training steps, the trained reinforcement learning agent hedges, in the training environment, as well as the correct Delta while outperforming misspecified Deltas. In the online learning phase, the trained reinforcement learning agent interacts with the market environment in real time, collects single terminal reward signals, and self-revises its hedging strategy. The hedging performance of the further trained reinforcement learning agent is demonstrated via an illustrative example on a rolling basis to reveal the self-revision capability of the hedging strategy by online learning.
KW - Hedging strategy self-revision
KW - Online learning phase
KW - Sequential anchor-hedging reward signals
KW - Single terminal reward signals
KW - Training phase
KW - Two-phase deep reinforcement learning
KW - Variable annuities hedging
UR - http://www.scopus.com/inward/record.url?scp=85150355424&partnerID=8YFLogxK
U2 - 10.1017/S1748499523000027
DO - 10.1017/S1748499523000027
M3 - Article
AN - SCOPUS:85150355424
SN - 1748-4995
VL - 17
SP - 503
EP - 546
JO - Annals of Actuarial Science
JF - Annals of Actuarial Science
IS - 3
ER -