Abstract
We have developed a new class of generative algorithms capable of efficiently learning arbitrary target distributions from possibly scarce, high-dimensional data and subsequently generating new samples. These particle-based generative algorithms are constructed as gradient flows of Lipschitz-regularized Kullback–Leibler or other f-divergences. In this framework, data from a source distribution can be stably transported as particles towards the vicinity of the target distribution. As a notable result in data integration, we demonstrate that the proposed algorithms accurately transport gene expression data points with dimensions exceeding 54K, even though the sample size is typically only in the hundreds.
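For orientation, the construction described above can be sketched in equations, assuming the standard Donsker–Varadhan variational form of the KL divergence with a Lipschitz constraint on the discriminator; the symbols $L$, $\phi$, $Q_t$, $P$, and $Y_t^i$ are illustrative choices, not necessarily the paper's exact notation:

```latex
% Hedged sketch: Lipschitz-regularized KL divergence (Donsker–Varadhan form)
% and the induced particle flow. Notation is illustrative; the paper's exact
% formulation may differ.
\[
  D_{\mathrm{KL}}^{L}(Q \,\|\, P)
  \;=\; \sup_{\operatorname{Lip}(\phi)\,\le\, L}
  \Bigl\{ \mathbb{E}_{Q}[\phi] \;-\; \log \mathbb{E}_{P}\bigl[e^{\phi}\bigr] \Bigr\},
  \qquad
  \frac{\mathrm{d}Y_t^{i}}{\mathrm{d}t}
  \;=\; -\,\nabla \phi_t^{*}\bigl(Y_t^{i}\bigr),
\]
% Particles Y_t^i are initialized at source samples; \phi_t^* denotes an
% optimal L-Lipschitz discriminator between the current particle distribution
% Q_t and the target P, so the flow moves particles to decrease the
% divergence to the target.
```

Intuitively, the Lipschitz bound caps the particle speed at $\|\nabla \phi_t^{*}\| \le L$, which is consistent with the stable transport the abstract highlights for few-sample, high-dimensional targets.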
| Original language | English |
| --- | --- |
| Pages (from-to) | 1205-1235 |
| Number of pages | 31 |
| Journal | SIAM Journal on Mathematics of Data Science |
| Volume | 6 |
| Issue number | 4 |
| Early online date | 9 Dec 2024 |
| DOIs | |
| Publication status | Published - Dec 2024 |
Keywords
- gradient flow
- generative modeling
- information theory
- optimal transport
- particle algorithms
- data integration