TY - JOUR
T1 - Kernel generalized least squares regression for network-structured data
AU - Antonian, Edward
AU - Peters, Gareth W.
AU - Chantler, Michael
PY - 2025/5/30
Y1 - 2025/5/30
N2 - In this paper, we study a class of non-parametric regression models for predicting graph signals {yt} as a function of explanatory variables {xt}. Recently, Kernel Graph Regression (KGR) and Gaussian Processes over Graph (GPoG) have emerged as promising techniques for this task. The goal of this paper is to examine several extensions to KGR/GPoG, with the aim of generalising them to a wider variety of data scenarios. The first extension we consider is the case of graph signals that have only been partially recorded, meaning a subset of their elements is missing at observation time. Next, we examine the statistical effect of correlated prediction error and propose a method for Generalized Least Squares (GLS) on graphs. In particular, we examine AR(1) vector autoregressive processes, which are commonly found in time-series applications. Finally, we use the Laplace approximation to determine a lower bound for the out-of-sample prediction error and derive a scalable expression for the marginal variance of each prediction. These methods are tested on both real and synthetic data, with the former taken from a network of air quality monitoring stations across California. We find evidence that the generalised GLS-KGR algorithm is well-suited to such time-series applications, outperforming several standard techniques on this dataset.
AB - In this paper, we study a class of non-parametric regression models for predicting graph signals {yt} as a function of explanatory variables {xt}. Recently, Kernel Graph Regression (KGR) and Gaussian Processes over Graph (GPoG) have emerged as promising techniques for this task. The goal of this paper is to examine several extensions to KGR/GPoG, with the aim of generalising them to a wider variety of data scenarios. The first extension we consider is the case of graph signals that have only been partially recorded, meaning a subset of their elements is missing at observation time. Next, we examine the statistical effect of correlated prediction error and propose a method for Generalized Least Squares (GLS) on graphs. In particular, we examine AR(1) vector autoregressive processes, which are commonly found in time-series applications. Finally, we use the Laplace approximation to determine a lower bound for the out-of-sample prediction error and derive a scalable expression for the marginal variance of each prediction. These methods are tested on both real and synthetic data, with the former taken from a network of air quality monitoring stations across California. We find evidence that the generalised GLS-KGR algorithm is well-suited to such time-series applications, outperforming several standard techniques on this dataset.
UR - https://www.scopus.com/pages/publications/105006878218
U2 - 10.1371/journal.pone.0324087
DO - 10.1371/journal.pone.0324087
M3 - Article
C2 - 40445910
SN - 1932-6203
VL - 20
JO - PLoS ONE
JF - PLoS ONE
IS - 5
M1 - e0324087
ER -