Spectral statistics predict how much accuracy a model will lose during compression, without ever running the process.
April 23, 2026
Original Paper
Predicting LLM Compression Degradation from Spectral Statistics
arXiv · 2604.18085
The Takeaway
Compressing large models usually involves trial and error to see what breaks. This paper shows that the interaction between the compression ratio and the stable rank of the weight matrices forecasts the outcome with 89% accuracy, letting engineers flag hard-to-compress models up front instead of burning days of compute on runs that fail. It also reveals that some weight matrices are mathematically more resilient to rank truncation than others. Compression becomes a predictable engineering step rather than a gamble, which should speed the deployment of LLMs on edge devices.
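The stable rank mentioned above is a standard spectral statistic: the squared Frobenius norm divided by the squared spectral norm, which roughly counts how many singular values carry most of the matrix's energy. The paper's actual predictor combines it with the compression ratio in a way not shown in this excerpt; the sketch below only illustrates how stable rank is computed and why a near-low-rank matrix scores lower than a dense random one (the matrix sizes are arbitrary).

```python
import numpy as np

def stable_rank(W: np.ndarray) -> float:
    """Stable rank = ||W||_F^2 / ||W||_2^2 (sum of squared singular
    values over the largest squared singular value)."""
    s = np.linalg.svd(W, compute_uv=False)
    return float(np.sum(s**2) / s[0]**2)

rng = np.random.default_rng(0)
# A product of thin factors is at most rank 8, so its stable rank is <= 8.
near_low_rank = rng.standard_normal((256, 8)) @ rng.standard_normal((8, 256))
# A dense Gaussian matrix spreads energy across many singular values.
dense = rng.standard_normal((256, 256))

print(stable_rank(near_low_rank))  # small: bounded by the true rank, 8
print(stable_rank(dense))          # much larger
```

Intuitively, a low stable rank means most of the weight's energy lives in a few directions, so truncating the rest loses little; that is the resilience the predictor exploits.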
From the abstract
Matrix-level low-rank compression is a promising way to reduce the cost of large language models, but running compression and evaluating the resulting models on language tasks can be prohibitively expensive. Can compression-induced degradation be predicted before committing to this compute? We systematically analyze the Qwen3 and Gemma3 model families across four representative low-rank compression methods: vanilla SVD, two ASVD variants, and SVD-LLM. We find that stable rank and information density …
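Of the four methods listed, vanilla SVD is the simplest: replace a weight matrix with the product of its top-k singular factors, choosing k from a parameter budget. A minimal sketch, assuming a compression ratio defined as retained parameters over original parameters (the paper's exact definitions and the ASVD/SVD-LLM variants are not in this excerpt):

```python
import numpy as np

def svd_compress(W: np.ndarray, keep_ratio: float):
    """Vanilla SVD truncation: keep the top-k singular triples so that
    the rank-k factors fit a parameter budget of keep_ratio * m * n."""
    m, n = W.shape
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Rank-k factors cost k*(m+n) parameters versus m*n for the dense matrix.
    k = max(1, int(keep_ratio * m * n / (m + n)))
    W_hat = (U[:, :k] * s[:k]) @ Vt[:k]
    return W_hat, k

rng = np.random.default_rng(1)
W = rng.standard_normal((128, 64))
W_hat, k = svd_compress(W, keep_ratio=0.5)
rel_err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
print(k, rel_err)
```

By Eckart-Young, this truncation is the best rank-k approximation in Frobenius norm; the paper's question is how that matrix-level error translates into task-level degradation, which is where the spectral statistics come in.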