Diffusion models can generate high-quality images without ever knowing which time step or noise level they are currently processing.
Generative models typically rely on explicit time conditioning to tell the network how much noise to remove at each step. This work uses a flow-matching formulation that aligns noisy data manifolds directly, removing the need for that extra signal. Dropping time conditioning simplifies the architecture while maintaining image quality that rivals time-conditioned diffusion. Practitioners can build generative systems that are less complex and potentially faster to train. The change reframes diffusion as a geometric alignment problem rather than a chronological denoising sequence.
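To make the idea concrete, here is a minimal sketch of what training without time conditioning can look like, using a generic linear (rectified-flow style) flow-matching objective in PyTorch. The names (`TimeFreeVelocityNet`, `flow_matching_step`) are hypothetical illustrations, not the paper's exact method: the point is that under a linear interpolation path the velocity target `x1 - x0` is constant in `t`, so time is only needed to construct the training input, never as a network input.

```python
import torch
import torch.nn as nn

class TimeFreeVelocityNet(nn.Module):
    """Hypothetical velocity network with NO time embedding.

    The model must infer "where it is" on the noise-to-data path
    from the input x_t alone.
    """
    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x_t: torch.Tensor) -> torch.Tensor:
        return self.net(x_t)  # note: no t argument

def flow_matching_step(model, x1, optimizer):
    """One linear flow-matching step; t builds x_t but is never fed to the model."""
    x0 = torch.randn_like(x1)                       # noise endpoint
    t = torch.rand(x1.shape[0], 1, device=x1.device)  # per-sample time
    x_t = (1 - t) * x0 + t * x1                     # point on the straight path
    target = x1 - x0                                # constant velocity along the path
    loss = ((model(x_t) - target) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Usage sketch: model = TimeFreeVelocityNet(dim=2)
#               opt = torch.optim.Adam(model.parameters(), lr=1e-3)
#               loss = flow_matching_step(model, data_batch, opt)
```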
Exploring Time Conditioning in Diffusion Generative Models from Disjoint Noisy Data Manifolds
arXiv · 2604.25289
In practice, training diffusion models typically requires explicit time conditioning to guide the network through the denoising sampling process. In deterministic samplers such as DDIM in particular, the absence of time conditioning leads to significant performance degradation. However, other deterministic approaches, such as flow matching, can generate high-quality content without this conditioning, raising the question of whether it is necessary at all. In this work, we revisit the role of time conditioning in diffusion generative models.
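For completeness, here is a matching sampling sketch under the same assumptions as the training snippet above: deterministic Euler integration of the learned ODE dx/dt = v(x), where the step count sets the integration schedule but the network itself never receives a time input. This illustrates the general mechanism, not the paper's specific sampler.

```python
import torch

@torch.no_grad()
def sample(model, shape, steps: int = 50):
    """Integrate dx/dt = v(x) from noise (t=0) to data (t=1) with Euler steps."""
    x = torch.randn(shape)     # start at the noise endpoint
    dt = 1.0 / steps           # time only schedules the step size
    for _ in range(steps):
        x = x + dt * model(x)  # the network is never told the current step
    return x
```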