Function Approximation Based Reinforcement Learning for Edge Caching in Massive MIMO Networks

Navneet Garg*, Mathini Sellathurai, Vimal Bhatia, Tharmalingam Ratnarajah

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

13 Citations (Scopus)
52 Downloads (Pure)

Abstract

Caching popular content in advance is an important technique for achieving low latency and reducing backhaul congestion in future wireless communication systems. In this article, a multi-cell massive multiple-input multiple-output (MIMO) system is considered, in which the locations of base stations are distributed as a Poisson point process. Assuming probabilistic caching, the average success probability (ASP) of the system is derived for a known content popularity (CP) profile, which in practice is time-varying and unknown in advance. Further, modeling CP variations across time as a Markov process, Q-learning, a reinforcement learning method, is employed to learn the optimal content placement strategy that optimizes the long-term discounted ASP and the average cache refresh rate. In Q-learning, the number of Q-updates is large and proportional to the number of states and actions. To reduce the space complexity and update requirements, and so make Q-learning scalable, two novel function-approximation-based Q-learning approaches (linear and non-linear) are proposed, in which only a constant number of variables (4 and 3, respectively) need to be updated, irrespective of the number of states and actions. The convergence of these approximation-based approaches is analyzed. Simulations verify that these approaches converge and learn a similar best content placement, which demonstrates the applicability and scalability of the proposed approximated Q-learning schemes.
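The idea of replacing a Q-table with a fixed number of learned variables can be illustrated with a minimal sketch of semi-gradient Q-learning under linear function approximation. This is not the scheme from the article: the toy environment (`step`), the four-element feature vector (`features`), and all hyperparameters are hypothetical choices made only for illustration. The state stands in for a popularity profile that evolves as a Markov chain, and the action stands in for a content placement.

```python
import random

def features(s, a, n_states, n_actions):
    """Hypothetical 4-dimensional feature vector for a (state, action) pair."""
    return [1.0, 1.0 if s == a else 0.0, s / n_states, a / n_actions]

def q_value(w, s, a, n_states, n_actions):
    """Approximate Q(s, a) as the inner product of weights and features."""
    return sum(wi * fi for wi, fi in zip(w, features(s, a, n_states, n_actions)))

def step(s, a, n_states, rng):
    """Toy environment (an assumption): reward 1 when the placement matches
    the current popularity state, 0 otherwise."""
    reward = 1.0 if s == a else 0.0
    # Markov popularity: stay with probability 0.8, otherwise jump uniformly.
    s_next = s if rng.random() < 0.8 else rng.randrange(n_states)
    return reward, s_next

def train(n_states=5, n_actions=5, steps=20000, alpha=0.05, gamma=0.9,
          eps=0.1, seed=0):
    """Semi-gradient Q-learning; only the 4 weights are updated per step,
    regardless of n_states and n_actions."""
    rng = random.Random(seed)
    w = [0.0] * 4
    s = 0
    for _ in range(steps):
        # Epsilon-greedy action selection on the approximate Q-values.
        if rng.random() < eps:
            a = rng.randrange(n_actions)
        else:
            a = max(range(n_actions),
                    key=lambda b: q_value(w, s, b, n_states, n_actions))
        r, s_next = step(s, a, n_states, rng)
        best_next = max(q_value(w, s_next, b, n_states, n_actions)
                        for b in range(n_actions))
        td_error = r + gamma * best_next - q_value(w, s, a, n_states, n_actions)
        phi = features(s, a, n_states, n_actions)
        # Gradient of a linear Q with respect to w is the feature vector itself.
        w = [wi + alpha * td_error * fi for wi, fi in zip(w, phi)]
        s = s_next
    return w

if __name__ == "__main__":
    print(train())
```

A tabular learner for the same toy problem would store `n_states * n_actions` entries, whereas this sketch stores four weights however large those counts become. Whether such a compact representation learns a good policy depends on how well the chosen features capture the true Q-function.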

Original language: English
Pages (from-to): 2304-2316
Number of pages: 13
Journal: IEEE Transactions on Communications
Volume: 69
Issue number: 4
Early online date: 28 Dec 2020
Publication status: Published - Apr 2021

Keywords

  • Linear function approximation
  • massive MIMO
  • non-linear function approximation
  • Poisson point process
  • Q-learning
  • wireless edge caching

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
