AI & ML Efficiency Breakthrough

Uses the viewer's gaze location to allocate tokens non-uniformly in diffusion models: dense where the eye is looking, sparse in the periphery. The aim is images that look full-resolution to the viewer at significantly lower compute.

March 25, 2026

Original Paper

Foveated Diffusion: Efficient Spatially Adaptive Image and Video Generation

Brian Chao, Lior Yariv, Howard Xiao, Gordon Wetzstein

arXiv · 2603.23491

The Takeaway

By generating high resolution only in the foveal region, this method cuts the number of tokens the model has to generate. Because computational cost grows quadratically with token count, a modest drop in tokens yields a much larger drop in compute. That makes it a promising step toward real-time, high-resolution VR/AR content generation.
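To make the scaling argument concrete, here is a back-of-envelope sketch in Python. It is not the paper's token layout: the square fovea, the grid size, and the merge factor are all hypothetical values chosen so the arithmetic is easy to follow, and the function names are made up for illustration. It only counts tokens and compares the quadratic (pairwise attention) term.

```python
import math


def uniform_tokens(grid_h: int, grid_w: int) -> int:
    """Token count when every patch of the grid gets its own token."""
    return grid_h * grid_w


def foveated_tokens(grid_h: int, grid_w: int,
                    fovea_h: int, fovea_w: int, merge: int) -> int:
    """Token count for a simplified (assumed) foveated layout.

    Patches inside a fovea_h x fovea_w window keep one token each.
    Every other patch is pooled so that a merge x merge block of
    peripheral patches shares a single token.
    """
    if fovea_h > grid_h or fovea_w > grid_w or merge < 1:
        raise ValueError("fovea must fit in the grid and merge must be >= 1")
    fovea = fovea_h * fovea_w
    periphery = grid_h * grid_w - fovea
    return fovea + math.ceil(periphery / (merge * merge))


def attention_cost_ratio(n_small: int, n_large: int) -> float:
    """Ratio of pairwise-attention cost, which scales with tokens squared."""
    return (n_small / n_large) ** 2


# Hypothetical setting: a 64 x 64 patch grid, a 16 x 16 patch fovea,
# and 4 x 4 pooling in the periphery.
full = uniform_tokens(64, 64)
fov = foveated_tokens(64, 64, 16, 16, merge=4)
print(full, fov, attention_cost_ratio(fov, full))
```

Under these assumed numbers the token count falls from 4096 to 496 (about 12% of the original), while the quadratic attention term falls to roughly 1.5% of the original. Only the pairwise-attention part of a model scales this way; per-token layers scale linearly, so the end-to-end saving would sit somewhere between the two figures.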

From the abstract

Diffusion and flow matching models have unlocked unprecedented capabilities for creative content creation, such as interactive image and streaming video generation. The growing demand for higher resolutions, frame rates, and context lengths, however, makes efficient generation increasingly challenging, as computational complexity grows quadratically with the number of generated tokens. Our work seeks to optimize the efficiency of the generation process in settings where the user's gaze location