AI & ML Efficiency Breakthrough

Introduces a learnable bridge between GELU and ReLU activations to enable deployment-friendly piecewise-linear networks.

March 24, 2026

Original Paper

λ-GELU: Learning Gating Hardness for Controlled ReLU-ization in Deep Networks

Cristian Pérez-Corral, Alberto Fernández-Hernández, Jose I. Mestre, Manuel F. Dolz, Enrique S. Quintana-Ortí

arXiv · 2603.21991

The Takeaway

Deployment toolchains often favor ReLU for speed and quantization, while researchers prefer the smooth training behavior of GELU. λ-GELU lets models train with a smooth activation and then be progressively 'hardened' into ReLUs after training, with minimal performance loss, bridging the gap between high-performance training and hardware-optimized inference.

From the abstract

Gaussian Error Linear Unit (GELU) is a widely used smooth alternative to the Rectified Linear Unit (ReLU), yet many deployment, compression, and analysis toolchains are most naturally expressed for piecewise-linear (ReLU-type) networks. We study a hardness-parameterized formulation of GELU, f(x; λ) = xΦ(λx), where Φ is the Gaussian CDF and λ ∈ [1, ∞) controls gate sharpness, with the goal of turning smooth gated training into a controlled path toward ReLU-compatibility.
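To make the formula concrete, here is a minimal PyTorch sketch of the hardness-parameterized activation f(x; λ) = xΦ(λx). The class name LambdaGELU, the learnable flag, and the λ values used in the demo are illustrative assumptions, not the paper's implementation; the authors' actual parameterization, constraints on λ, and hardening schedule may differ. The sketch only shows the key property: λ = 1 recovers the exact (erf-based) GELU, and as λ grows the gate Φ(λx) approaches a step function, so the activation approaches ReLU.

```python
import torch
import torch.nn as nn


class LambdaGELU(nn.Module):
    """Hardness-parameterized GELU: f(x; lambda) = x * Phi(lambda * x).

    Illustrative sketch based on the abstract's formula; the paper's actual
    parameterization and hardening procedure may differ.
    """

    def __init__(self, lam: float = 1.0, learnable: bool = False):
        super().__init__()
        # lambda >= 1 controls gate sharpness; lambda = 1 gives standard GELU,
        # and lambda -> infinity gives ReLU since Phi(lambda * x) -> step(x).
        if learnable:
            self.lam = nn.Parameter(torch.tensor(float(lam)))
        else:
            self.register_buffer("lam", torch.tensor(float(lam)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Phi is the standard Gaussian CDF, written via the error function:
        # Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))
        phi = 0.5 * (1.0 + torch.erf(self.lam * x / 2.0 ** 0.5))
        return x * phi


if __name__ == "__main__":
    x = torch.linspace(-3, 3, steps=7)
    smooth = LambdaGELU(lam=1.0)      # matches exact (erf-based) GELU
    hardened = LambdaGELU(lam=100.0)  # sharp gate, numerically close to ReLU
    print(smooth(x))
    print(hardened(x))
    print(torch.relu(x))              # the piecewise-linear limit for comparison
```

Running the demo shows the λ = 100 output agreeing with torch.relu(x) to within numerical precision, which is the sense in which a smoothly trained network can be pushed toward a ReLU-compatible, piecewise-linear form.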