Diffusion models provably generalize by capturing manifold geometry long before they achieve full density estimation or memorize the training data.
March 26, 2026
Original Paper
Manifold Generalization Provably Precedes Memorization in Diffusion Models
arXiv · 2603.23792
The Takeaway
The paper provides a theoretical guarantee for why diffusion models generate novel samples even when the learned score is only 'coarse.' It explains the near-parametric convergence rate in terms of manifold smoothness rather than data density, offering a new mathematical lens for scaling generative models.
From the abstract
Diffusion models often generate novel samples even when the learned score is only \emph{coarse} -- a phenomenon not accounted for by the standard view of diffusion training as density estimation. In this paper, we show that, under the \emph{manifold hypothesis}, this behavior can instead be explained by coarse scores capturing the \emph{geometry} of the data while discarding the fine-scale distributional structure of the population measure~$\mu_{\scriptscriptstyle\mathrm{data}}$. Concretely, whe
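To make the abstract's claim concrete, here is a toy illustration (not from the paper, and every function name in it is our own): on data supported on the unit circle, the score of the noised empirical measure at a moderate noise level is exactly the "coarse" score — it sees the circle's geometry but blurs over individual training points. Running annealed Langevin dynamics with this score produces samples that land near the manifold rather than on the training points, a minimal sketch of geometry being captured before memorization.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Training set": a handful of points on the unit circle (the data manifold).
n_data = 32
theta = rng.uniform(0.0, 2.0 * np.pi, n_data)
data = np.stack([np.cos(theta), np.sin(theta)], axis=1)  # (n_data, 2)

def score(x, sigma):
    """Exact score of the empirical measure convolved with N(0, sigma^2 I).

    At large sigma this score is 'coarse': it points toward the circle as a
    whole rather than toward any individual training point.
    """
    diff = data[None, :, :] - x[:, None, :]            # (n, n_data, 2)
    logw = -np.sum(diff ** 2, axis=2) / (2.0 * sigma ** 2)
    w = np.exp(logw - logw.max(axis=1, keepdims=True)) # softmax responsibilities
    w /= w.sum(axis=1, keepdims=True)
    return np.einsum("nd,ndk->nk", w, diff) / sigma ** 2

# Annealed Langevin dynamics: start from pure noise, descend a sigma ladder.
x = rng.normal(size=(500, 2)) * 2.0
for sigma in np.geomspace(2.0, 0.05, 30):
    eps = 0.1 * sigma ** 2                              # step size tied to noise level
    for _ in range(20):
        x = x + eps * score(x, sigma) + np.sqrt(2.0 * eps) * rng.normal(size=x.shape)

# Samples concentrate near the manifold (radius 1), not at the 32 training points.
radii = np.linalg.norm(x, axis=1)
print(f"mean |radius - 1| = {np.abs(radii - 1.0).mean():.3f}")
```

Even with only 32 training points, the final samples cover the circle at roughly the correct radius — the generated distribution matches the manifold's geometry well before the score is fine enough to resolve (and memorize) individual data points.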