AI & ML Nature Is Weird

Increasing LoRA rank by 8x only gives you a 1.68x boost in actual learning capacity—the rest is wasted compute.

April 16, 2026

Original Paper

Rank Utilization in Low-Rank Adaptation: Spectral Diagnostics and Scaling Laws for Fine-Tuning Quality

Yassine Yazidi, Hamid Garmani, Mohamed BASLAM

SSRN · 6580122

The Takeaway

We've been scaling Low-Rank Adaptation (LoRA) under the assumption that higher rank means more learning power. This paper uses spectral diagnostics to show that rank utilization is severely sub-linear: diminishing returns set in much faster than we thought, so most practitioners are wasting significant memory and compute on 'empty' adapter capacity. The authors derive a new set of scaling laws specifically for fine-tuning, letting developers tune their rank settings for real efficiency. This is a direct cost-saving result for anyone fine-tuning LLMs at scale: you can likely get the same performance with much smaller, faster adapters.
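To make "rank utilization" concrete: stable rank is a standard spectral measure, defined as the squared Frobenius norm divided by the squared spectral norm, and I'm assuming here it matches the paper's utilization metric. A minimal sketch of measuring it on a toy LoRA update ΔW = B·A (all matrix shapes and the decay profile below are illustrative, not from the paper):

```python
import numpy as np

def stable_rank(m: np.ndarray) -> float:
    """Stable rank: ||M||_F^2 / ||M||_2^2.

    A common spectral proxy for 'effective' rank; assumed here to
    correspond to the utilization metric the paper measures.
    """
    s = np.linalg.svd(m, compute_uv=False)
    return float((s ** 2).sum() / s[0] ** 2)

# Toy LoRA update with nominal rank 16
rng = np.random.default_rng(0)
r = 16
B = rng.standard_normal((256, r))
A = rng.standard_normal((r, 512))

# A fast-decaying singular-value profile simulates under-utilized
# capacity: most of the nominal rank carries almost no energy.
decay = np.diag(0.5 ** np.arange(r))
delta_w = B @ decay @ A

print(f"nominal rank: {r}, stable rank: {stable_rank(delta_w):.2f}")
```

For a full-energy matrix (e.g. the identity) stable rank equals the true rank; for a matrix dominated by one singular value it collapses toward 1, which is why it serves as a diagnostic for how much of the allocated rank the adapter actually uses.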

From the abstract

Low-Rank Adaptation (LoRA) allocates a fixed rank $r$ to each adapter matrix, yet how much of that rank is actually used in practice has not been measured. In this work, we quantify this spectral utilization across Qwen2.5-1.5B, LLaMA-3.2-1B, and Phi-2 (2.7B) under seven PEFT configurations and three seeds ($n=63$). Consequently, we find that the mean stable rank scales sub-linearly with nominal rank, $\operatorname{SR}(r)\propto r^{\hat\beta}$ with $\hat\beta\in[0.23,0.30]$ ($R^2>0.99$) across
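The power law $\operatorname{SR}(r)\propto r^{\hat\beta}$ is where the headline number comes from: multiplying the nominal rank by 8 multiplies the stable rank by only $8^{\hat\beta}$. A quick check across the reported exponent range (the 1.68x figure corresponds to $\hat\beta = 0.25$, the midpoint):

```python
# Sub-linear scaling law from the abstract: SR(r) ~ r**beta,
# with the fitted exponent beta in [0.23, 0.30].
for beta in (0.23, 0.25, 0.30):
    boost = 8 ** beta  # effective gain from an 8x rank increase
    print(f"beta={beta:.2f}: 8x nominal rank -> {boost:.2f}x stable rank")
```

So even at the top of the fitted range, an 8x rank increase buys less than a 2x gain in effective capacity, while memory and compute for the adapter grow linearly with $r$.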