Theoretical analysis reveals that the efficiency benefits of low-dimensional data structures for diffusion models diminish significantly when the data manifold is non-linear.
March 25, 2026
Original Paper
Asymptotic Learning Curves for Diffusion Models with Random Features Score and Manifold Data
arXiv · 2603.22962
The Takeaway
This challenges the assumption that diffusion models always scale efficiently with intrinsic data dimensionality. It provides a more nuanced understanding of sample complexity for generative models, suggesting that architectural or algorithmic changes are needed to exploit non-linear structures effectively.
From the abstract
We study the theoretical behavior of denoising score matching, the learning task associated with diffusion models, when the data distribution is supported on a low-dimensional manifold and the score is parameterized using a random feature neural network. We derive asymptotically exact expressions for the test, train, and score errors in the high-dimensional limit. Our analysis reveals that, for linear manifolds, the sample complexity required to learn the score function scales linearly with the intrinsic dimension […]
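The setting described in the abstract can be sketched as a toy experiment: data supported on a low-dimensional linear subspace of a high-dimensional space, a random-feature score model whose first layer is fixed at random, and a linear readout trained by ridge regression on the standard denoising score matching target. Everything below (the dimensions, the single noise level, the ReLU feature map, the regularization strength) is an illustrative assumption, not a detail taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: ambient dimension D, intrinsic dimension d,
# n samples, p random features, one fixed noise level sigma.
D, d, n, p = 50, 3, 2000, 400
sigma = 0.5

# Data supported on a d-dimensional linear subspace of R^D.
U, _ = np.linalg.qr(rng.standard_normal((D, d)))
X = rng.standard_normal((n, d)) @ U.T          # clean samples, shape (n, D)
eps = rng.standard_normal((n, D))
X_noisy = X + sigma * eps                      # noised samples

# Random feature map with a frozen first layer: phi(x) = relu(W x / sqrt(D)).
W = rng.standard_normal((p, D))
def features(x):
    return np.maximum(W @ x.T / np.sqrt(D), 0.0).T   # shape (n, p)

# Denoising score matching target for Gaussian noise:
# s*(x_noisy) = (x_clean - x_noisy) / sigma^2 = -eps / sigma.
target = -eps / sigma

# Train only the linear readout A (p, D) by ridge regression:
# minimize ||Phi A - target||^2 + lam ||A||^2.
Phi = features(X_noisy)
lam = 1e-3
A = np.linalg.solve(Phi.T @ Phi + lam * np.eye(p), Phi.T @ target)

# Mean squared train error of the learned score model.
train_err = np.mean((Phi @ A - target) ** 2)
```

Varying `n` against `d` (for data in a linear subspace) or against a curved embedding would trace out the kind of learning curves the paper characterizes exactly in the high-dimensional limit.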