SeriesFusion
Science, curated & edited by AI

Intelligence is actually just the process of extreme data compression, and a "V-shaped" pattern in a model's layers proves it.

This analysis of GPT-2 reports that the model's layers act as a funnel that crystallizes information into minimal representations. On this view, a smarter model doesn't just learn more; it learns to store what it knows in a smaller, more efficient form. The finding suggests that AI models might be made much smaller with little loss of capability by forcing this compression. It shifts our understanding of thinking from an expansive process to a restrictive one. This could lead to high-performance AI that runs easily on local devices like phones. On this account, efficiency is the true mark of intelligence in both biological and artificial systems.
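
To make the compression idea concrete, here is a minimal Python sketch of grouping contexts into predictive equivalence classes: histories that lead to the same next-symbol distribution are merged into one class. It is an illustration only, not the paper's method. The function name `predictive_classes`, the toy sequence, and the restriction to one-step prediction are all assumptions made for this example; the ε-machine construction named in the abstract considers the entire future, not just the next symbol.

```python
from collections import Counter, defaultdict
from fractions import Fraction


def predictive_classes(sequence, history_len):
    """Group fixed-length histories by their empirical next-symbol distribution.

    Hypothetical helper for illustration: two histories land in the same
    class when they predict the next symbol with identical probabilities.
    """
    # Count which symbol follows each history of length `history_len`.
    counts = defaultdict(Counter)
    for i in range(len(sequence) - history_len):
        history = sequence[i:i + history_len]
        next_symbol = sequence[i + history_len]
        counts[history][next_symbol] += 1

    # Merge histories whose next-symbol distributions match exactly.
    classes = defaultdict(list)
    for history, counter in counts.items():
        total = sum(counter.values())
        # Exact fractions avoid floating-point mismatches when comparing.
        distribution = frozenset(
            (symbol, Fraction(n, total)) for symbol, n in counter.items()
        )
        classes[distribution].append(history)

    return sorted(sorted(group) for group in classes.values())


# Toy data (an assumption): the pattern "aab" repeated 20 times.
toy_sequence = "aab" * 20
classes = predictive_classes(toy_sequence, history_len=2)
print(classes)  # [['aa'], ['ab', 'ba']]
```

In this toy case three distinct histories collapse into two classes, because "ab" and "ba" are both always followed by "a". The reported GPT-2 result is a much larger version of the same kind of merging, with tokens collapsing into roughly 200 classes.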

Original Paper

Intelligence as Predictive Compression: Evidence from GPT-2 Analysis and Learned Concept Bottlenecks

Ahmed Ghazouani

SSRN  ·  6376458

We present a mathematical framework connecting intelligence to predictive compression through ε-machines (minimal sufficient statistics of the past for predicting the future) and demonstrate that modern transformer language models implicitly implement this compression. Through systematic reverse-engineering of GPT-2, we reveal a three-phase "V-shape" crystallization pattern: tokens compress into ∼200 predictive equivalence classes by layer 2, undergo controlled semantic disambiguation in middle