AI & ML Nature Is Weird

When you shrink an AI to fit on a phone, it doesn’t just get slower—it gets weirdly cocky about things it’s wrong about and shy about things it actually knows.

April 13, 2026

Original Paper

Quantisation Reshapes the Metacognitive Geometry of Language Models

Jon-Paul Cacioli

arXiv · 2604.08976

The Takeaway

The paper reveals that quantisation reshuffles the model's internal self-awareness in unpredictable ways. Developers can no longer assume a smaller model is just a 'lower-resolution' version of the original; its very sense of certainty has been rearranged.

From the abstract

We report that model quantisation restructures domain-level metacognitive efficiency in LLMs rather than degrading it uniformly. Evaluating Llama-3-8B-Instruct on the same 3,000 questions at Q5_K_M and f16 precision, we find that M-ratio profiles across four knowledge domains are uncorrelated between formats (Spearman rho = 0.00). Arts & Literature moves from worst-monitored (M-ratio = 0.606 at Q5_K_M) to best-monitored (1.542 at f16). Geography moves from well-monitored (1.210) to under-monitor
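The comparison the abstract describes can be sketched in a few lines: compute an M-ratio (meta-d'/d') per domain at each precision, then take the Spearman rank correlation between the two profiles. This is a minimal illustration, not the paper's pipeline; only the Arts & Literature values and Geography's Q5_K_M value come from the abstract, and the remaining numbers are made-up placeholders.

```python
def spearman_rho(xs, ys):
    """Spearman's rho for sequences without ties: Pearson correlation on ranks."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = rank + 1
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mean = (n + 1) / 2
    # With no ties, rx and ry are both permutations of 1..n,
    # so their variances are equal and one denominator suffices.
    cov = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    var = sum((a - mean) ** 2 for a in rx)
    return cov / var

# M-ratio (meta-d'/d') per knowledge domain. Arts & Literature values and
# Geography at Q5_K_M are from the abstract; the rest are illustrative only.
q5_k_m = {"Arts & Literature": 0.606, "Geography": 1.210,
          "Science": 0.95, "History": 1.05}
f16    = {"Arts & Literature": 1.542, "Geography": 0.80,
          "Science": 1.10, "History": 0.70}

domains = list(q5_k_m)
rho = spearman_rho([q5_k_m[d] for d in domains], [f16[d] for d in domains])
print(f"Spearman rho across domains: {rho:.2f}")
```

A rho near zero, as the paper reports, means knowing which domains a model monitors well at one precision tells you nothing about which it monitors well at the other.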