DRiffusion parallelizes diffusion-model sampling across multiple devices with a draft-and-refine process, for speedups of up to 3.7x.
March 30, 2026
Original Paper
DRiffusion: Draft-and-Refine Process Parallelizes Diffusion Models with Ease
arXiv · 2603.25872
The Takeaway
Most diffusion speedups rely on distillation or on cutting the number of sampling steps. This framework instead lets practitioners keep their existing models and reduce latency in interactive applications by adding devices, without losing quality.
From the abstract
Diffusion models have achieved remarkable success in generating high-fidelity content but suffer from slow, iterative sampling, resulting in high latency that limits their use in interactive applications. We introduce DRiffusion, a parallel sampling framework that parallelizes diffusion inference through a draft-and-refine process. DRiffusion employs skip transitions to generate multiple draft states for future timesteps and computes their corresponding noises in parallel, which are then used in
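To make the draft-and-refine idea concrete, here is a minimal toy sketch in Python. It is not the paper's implementation. Everything in it is an assumption made for illustration: the scalar state, the DDIM-style deterministic update in `ddim_step`, the analytic stand-in `eps_model`, the schedule in `make_alpha_bar`, and the use of threads in place of devices. The excerpt above stops before saying how the parallel noises are consumed, so the refine loop here (plugging each draft's noise into an ordinary one-step update) is a guess at the general shape rather than a description of DRiffusion.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def make_alpha_bar(num_steps):
    # Hypothetical cosine-style schedule: close to 1 at t=0 (clean), close to 0 at the last t (noise).
    return [math.cos(0.5 * math.pi * (t + 1) / (num_steps + 1)) ** 2 for t in range(num_steps)]


def eps_model(x, t, alpha_bar):
    # Toy stand-in for a trained noise predictor (optimal for unit-Gaussian data).
    return math.sqrt(1.0 - alpha_bar[t]) * x


def ddim_step(x, eps, a_from, a_to):
    # Deterministic DDIM-style transition between two noise levels (assumed update rule).
    x0 = (x - math.sqrt(1.0 - a_from) * eps) / math.sqrt(a_from)
    return math.sqrt(a_to) * x0 + math.sqrt(1.0 - a_to) * eps


def sequential_sample(x, ts, alpha_bar):
    # Baseline: one model call per transition, strictly in order.
    for cur, nxt in zip(ts, ts[1:]):
        eps = eps_model(x, cur, alpha_bar)
        x = ddim_step(x, eps, alpha_bar[cur], alpha_bar[nxt])
    return x


def draft_refine_sample(x, ts, alpha_bar, n_devices):
    # Returns the final state and the number of parallel model-evaluation rounds.
    eps = eps_model(x, ts[0], alpha_bar)  # noise at the starting state
    i, rounds = 0, 0
    with ThreadPoolExecutor(max_workers=n_devices) as pool:  # threads stand in for devices
        while i < len(ts) - 1:
            targets = ts[i + 1 : i + 1 + n_devices]
            # Draft: skip transitions jump from the current state straight to each future timestep.
            drafts = [ddim_step(x, eps, alpha_bar[ts[i]], alpha_bar[t]) for t in targets]
            # Evaluate the noise of every draft concurrently.
            draft_eps = list(pool.map(lambda d, t: eps_model(d, t, alpha_bar), drafts, targets))
            # Refine (assumed): walk one step at a time, reusing the noises computed on the drafts.
            prev_t = ts[i]
            for t, e in zip(targets, draft_eps):
                x = ddim_step(x, eps, alpha_bar[prev_t], alpha_bar[t])
                eps, prev_t = e, t
            i += len(targets)
            rounds += 1
    return x, rounds


if __name__ == "__main__":
    alpha_bar = make_alpha_bar(12)
    ts = list(range(11, -1, -1))  # timesteps from noisiest to cleanest
    print(sequential_sample(1.5, ts, alpha_bar))
    print(draft_refine_sample(1.5, ts, alpha_bar, n_devices=4))
```

In this toy, the 11 transitions are covered in 3 parallel rounds with 4 workers instead of 11 sequential model calls, and with a single worker the procedure reduces exactly to the sequential sampler. Pure-Python threads will not show a real wall-clock gain; they only mark where per-device model calls would go.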