AI & ML Efficiency Breakthrough

Truncated backpropagation through video decoding reduces the memory cost of fine-tuning video diffusion models with pixel-wise losses from linear in the number of frames to constant.

March 19, 2026

Original Paper

ChopGrad: Pixel-Wise Losses for Latent Video Diffusion via Truncated Backpropagation

Dmitriy Rivkin, Parker Ewen, Lili Gao, Julian Ost, Stefanie Walz, Rasika Kangutkar, Mario Bijelic, Felix Heide

arXiv · 2603.17812

The Takeaway

The ChopGrad scheme makes fine-tuning with pixel-wise losses feasible on long or high-resolution videos, where it was previously computationally intractable. Practitioners can apply pixel-domain objectives to tasks such as super-resolution or inpainting on full video sequences without training memory growing with sequence length.
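To make the memory argument concrete, here is a minimal sketch of generic truncated backpropagation on a toy scalar recurrence. It is not the paper's implementation: the recurrence `h_t = w * h_{t-1} + z_t`, the squared-error loss, and the names `truncated_grad` and `window` are all assumptions chosen for illustration. The point is only that each frame's loss is backpropagated through at most `window` earlier steps, so the number of stored states stays fixed however long the sequence is.

```python
from collections import deque


def truncated_grad(w, latents, targets, window):
    """Gradient of sum_t (h_t - y_t)^2 with respect to w for the toy
    recurrence h_t = w * h_{t-1} + z_t (initial state 0), where each
    frame's loss is backpropagated through at most `window` steps.

    Returns (gradient, peak number of stored states).
    Hypothetical illustration only, not the ChopGrad algorithm itself.
    """
    # Holds h_{t-1}, h_{t-2}, ... newest first; older states are dropped
    # automatically once `window` entries are stored.
    prev_states = deque(maxlen=window)
    h = 0.0
    grad = 0.0
    peak_stored = 0
    for z, y in zip(latents, targets):
        prev_states.appendleft(h)      # state that feeds this step
        h = w * h + z                  # "decode" the next frame
        dloss_dh = 2.0 * (h - y)       # pixel-wise (squared-error) loss term
        # Exact derivative: dh_t/dw = sum_j w^j * h_{t-1-j}.
        # Truncation keeps only the first `window` terms of that sum.
        dh_dw = sum((w ** j) * s for j, s in enumerate(prev_states))
        grad += dloss_dh * dh_dw
        peak_stored = max(peak_stored, len(prev_states))
    return grad, peak_stored


if __name__ == "__main__":
    latents = [1.0, 2.0, 3.0, 4.0]
    targets = [0.0, 0.0, 0.0, 0.0]
    full = truncated_grad(0.5, latents, targets, window=len(latents))
    chopped = truncated_grad(0.5, latents, targets, window=1)
    print("full gradient, stored states:", full)
    print("truncated gradient, stored states:", chopped)
```

With `window` equal to the sequence length the result is the exact gradient and storage grows with the number of frames; with a small fixed `window` the gradient is an approximation and storage is bounded by `window`. How the paper chooses the truncation and how accurate it remains are not covered by this sketch.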

From the abstract

Recent video diffusion models achieve high-quality generation through recurrent frame processing where each frame generation depends on previous frames. However, this recurrent mechanism means that training such models in the pixel domain incurs prohibitive memory costs, as activations accumulate across the entire video sequence. This fundamental limitation also makes fine-tuning these models with pixel-wise losses computationally intractable for long or high-resolution videos. This paper introd

