AI & ML Breaks Assumption

Large Language Models can maintain performance with only 16-64 unique weight values per matrix, as only the relative rank of weights matters.

March 19, 2026

Original Paper

Only relative ranks matter in weight-clustered large language models

Borja Aizpurua, Sukhbinder Singh, Román Orús

arXiv · 2603.17917

The Takeaway

This discovery suggests that precise weight magnitudes are largely irrelevant compared to their ordinal ranking. It offers a training-free path to extreme LLM compression and implies that current quantization methods may be focusing on the wrong statistical properties.
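To see why shared weight values enable extreme compression, note that a matrix restricted to K unique values only needs ceil(log2 K) bits per weight for an index, plus a tiny K-entry codebook. A quick back-of-the-envelope sketch (the matrix size and fp16 baseline here are illustrative assumptions, not figures from the paper):

```python
import math

def clustered_size_bytes(n_weights: int, k: int, codebook_dtype_bytes: int = 2) -> float:
    """Storage for a matrix compressed to k shared values:
    ceil(log2 k) bits per weight index, plus a k-entry codebook."""
    index_bits = math.ceil(math.log2(k))
    return n_weights * index_bits / 8 + k * codebook_dtype_bytes

# Illustrative example: a 4096x4096 matrix stored in fp16 (2 bytes/weight)
n = 4096 * 4096
baseline = n * 2
for k in (16, 64):
    ratio = baseline / clustered_size_bytes(n, k)
    print(f"K={k}: ~{ratio:.1f}x smaller than fp16")
```

With K=16 the index costs 4 bits per weight, roughly a 4x saving over fp16 before any further entropy coding.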

From the abstract

Large language models (LLMs) contain billions of parameters, yet many exact values are not essential. We show that what matters most is the relative rank of weights (whether one connection is stronger or weaker than another) rather than precise magnitudes. To reduce the number of unique weight values, we apply weight clustering to pretrained models, replacing every weight matrix with K shared values from K-means. For Llama 3.1-8B-Instruct and SmolLM2-135M, reducing each matrix to only 16-64 distinct values […]
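The clustering step the abstract describes can be sketched in a few lines: flatten a weight matrix, run 1-D K-means (Lloyd's algorithm, with quantile initialization as an assumed implementation detail not specified in the abstract), and replace every weight with its nearest centroid. Because nearest-value quantization in one dimension is monotone, the relative rank of the weights is preserved, which is exactly the property the paper argues matters.

```python
import numpy as np

def cluster_weights(w: np.ndarray, k: int = 16, iters: int = 25) -> np.ndarray:
    """Replace every entry of a weight matrix with one of k shared values
    found by 1-D K-means (Lloyd's algorithm) over the flattened weights."""
    flat = w.ravel()
    # Initialize centroids at evenly spaced quantiles of the weight distribution
    centroids = np.quantile(flat, np.linspace(0, 1, k))
    for _ in range(iters):
        # Assign each weight to its nearest centroid
        idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        # Move each non-empty centroid to the mean of its assigned weights
        for j in range(k):
            if np.any(idx == j):
                centroids[j] = flat[idx == j].mean()
    idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    return centroids[idx].reshape(w.shape)

# Demo on a random stand-in for a weight matrix
w = np.random.default_rng(1).normal(size=(128, 128)).astype(np.float32)
wq = cluster_weights(w, k=16)
print("unique values:", np.unique(wq).size)  # at most 16
```

This is a minimal sketch, not the authors' code: a production version would vectorize the centroid update and process each of the model's matrices independently, as the abstract indicates (K shared values per matrix).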