AI & ML Efficiency Breakthrough

Near-lossless KV cache compression using angular quantization in the Walsh-Hadamard domain at ~3.5 bits per element.

March 31, 2026

Original Paper

TurboAngle: Near-Lossless KV Cache Compression via Uniform Angle Quantization

Dipkumar Patel

arXiv · 2603.27467

The Takeaway

The method exploits the uniform distribution of angles in the Walsh-Hadamard domain to compress keys and values with minimal perplexity degradation. This allows for significantly longer context windows on hardware with limited VRAM without the usual accuracy trade-offs of scalar quantization.
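The core idea can be illustrated with a minimal sketch: apply a random sign flip (diagonal rotation), transform into the Walsh-Hadamard domain, and quantize the angle of each consecutive element pair on a uniform grid. This is an assumption-laden illustration, not the paper's implementation; in particular, the pair magnitudes are stored exactly here for simplicity, whereas a real compressor would quantize them too.

```python
import numpy as np

def fwht(v):
    """Orthonormal Fast Walsh-Hadamard transform (length must be a power of two).
    With the 1/sqrt(n) scaling the transform is its own inverse."""
    v = v.astype(np.float64).copy()
    n = len(v)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = v[j], v[j + h]
                v[j], v[j + h] = a + b, a - b
        h *= 2
    return v / np.sqrt(n)

def compress(x, signs, angle_bits=4):
    """Rotate into the Walsh-Hadamard domain, then quantize each consecutive
    pair's angle to a uniform grid of 2**angle_bits levels."""
    z = fwht(signs * x)
    pairs = z.reshape(-1, 2)
    angles = np.arctan2(pairs[:, 1], pairs[:, 0])      # approximately uniform on (-pi, pi]
    radii = np.linalg.norm(pairs, axis=1)              # kept exact here for illustration
    levels = 1 << angle_bits
    codes = np.round((angles + np.pi) / (2 * np.pi) * levels).astype(int) % levels
    return codes, radii

def decompress(codes, radii, signs, angle_bits=4):
    levels = 1 << angle_bits
    angles = codes / levels * 2 * np.pi - np.pi
    pairs = radii[:, None] * np.stack([np.cos(angles), np.sin(angles)], axis=1)
    # FWHT is self-inverse; multiplying by signs undoes the diagonal rotation.
    return signs * fwht(pairs.reshape(-1))

rng = np.random.default_rng(0)
d = 128
x = rng.standard_normal(d)
signs = rng.choice([-1.0, 1.0], size=d)                # random diagonal rotation
codes, radii = compress(x, signs)
x_hat = decompress(codes, radii, signs)
err = np.linalg.norm(x - x_hat) / np.linalg.norm(x)    # small relative error
```

Because the angles are near-uniform after the rotation, a uniform angular grid wastes no codebook capacity, which is what keeps the reconstruction error low at a few bits per element.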

From the abstract

We compress KV cache entries by quantizing angles in the Fast Walsh-Hadamard domain, where a random diagonal rotation makes consecutive element pairs approximately uniformly distributed on the unit circle. We extend this angular quantizer with per-layer early-boost, which independently configures K and V codebook sizes at each layer, allocating higher precision to a model-specific subset of critical layers. Across seven models (1B to 7B parameters), per-layer early-boost achieves lossless compression …
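A per-layer early-boost schedule might be configured along these lines. Everything here is hypothetical: the function names, the choice of which layers to boost, and the bit-widths are illustrative assumptions, not values from the paper.

```python
def early_boost_schedule(num_layers, boosted_layers, base_bits=3, boost_bits=5):
    """Assign independent K and V angle bit-widths per layer, giving a
    critical subset of layers (e.g. the earliest ones) higher precision."""
    return {
        layer: {
            "k_bits": boost_bits if layer in boosted_layers else base_bits,
            "v_bits": boost_bits if layer in boosted_layers else base_bits,
        }
        for layer in range(num_layers)
    }

def average_bits(schedule):
    """Average bits per cached element across all layers and both K and V."""
    total = sum(cfg["k_bits"] + cfg["v_bits"] for cfg in schedule.values())
    return total / (2 * len(schedule))

# Boosting the first two layers of a 16-layer model lifts the average
# only slightly above the 3-bit base rate.
sched = early_boost_schedule(16, boosted_layers={0, 1})
avg = average_bits(sched)  # -> 3.25
```

Keeping the boost confined to a small subset of layers is what lets the average bit rate stay near the base rate while protecting the layers most sensitive to quantization.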