Replaces the heuristic constant momentum (0.9) with a parameter-free, physics-inspired schedule that speeds up convergence by nearly 2x.
April 1, 2026
Original Paper
Beta-Scheduling: Momentum from Critical Damping as a Diagnostic and Correction Tool for Neural Network Training
arXiv · 2603.28921
The Takeaway
The paper challenges the 60-year convention of fixed momentum in neural network training. By deriving momentum from critical damping, it provides a principled way to accelerate training and a new diagnostic tool for localizing layer-wise failure modes in models.
From the abstract
Standard neural network training uses constant momentum (typically 0.9), a convention dating to 1964 with limited theoretical justification for its optimality. We derive a time-varying momentum schedule from the critically damped harmonic oscillator: mu(t) = 1 - 2*sqrt(alpha(t)), where alpha(t) is the current learning rate. This beta-schedule requires zero free parameters beyond the existing learning rate schedule. On ResNet-18/CIFAR-10, beta-scheduling delivers 1.9x faster convergence to 90% accuracy.
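The schedule above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the cosine learning-rate schedule and its parameters (`alpha_max`, `alpha_min`, `total_steps`) are assumptions made here for demonstration; only the mapping mu(t) = 1 - 2*sqrt(alpha(t)) comes from the abstract.

```python
import math

def critically_damped_momentum(alpha: float) -> float:
    """Momentum from critical damping: mu(t) = 1 - 2*sqrt(alpha(t)).

    alpha is the current learning rate, so the schedule adds no free
    parameters beyond the existing learning-rate schedule.
    """
    return 1.0 - 2.0 * math.sqrt(alpha)

def cosine_lr(step: int, total_steps: int,
              alpha_max: float = 0.1, alpha_min: float = 1e-4) -> float:
    """Hypothetical cosine learning-rate schedule, used only for illustration."""
    cos = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return alpha_min + (alpha_max - alpha_min) * cos

# As the learning rate decays, the derived momentum rises toward 1,
# recovering values near the conventional 0.9 late in training.
for step in (0, 500, 1000):
    a = cosine_lr(step, 1000)
    print(f"step {step:4d}: lr={a:.4f}  momentum={critically_damped_momentum(a):.4f}")
```

Note that with a large initial learning rate the derived momentum is much smaller than 0.9, and it only approaches the conventional value once the learning rate has decayed, which is where the claimed speedup comes from.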