AI & ML Scaling Insight

Mechanistic analysis of 'counting circuits' in VLMs allows for lightweight interventions that improve general visual reasoning performance.

March 20, 2026

Original Paper

Counting Circuits: Mechanistic Interpretability of Visual Reasoning in Large Vision-Language Models

Liwei Che, Zhiyu Xue, Yihao Quan, Benlin Liu, Zeru Shi, Michelle Hurst, Jacob Feldman, Ruixiang Tang, Ranjay Krishna, Vladimir Pavlovic

arXiv · 2603.18523

The Takeaway

The paper identifies specific neural circuits responsible for object counting and shows that fine-tuning restricted to these circuits yields +8% gains on out-of-distribution (OOD) counting and +1.5% on general benchmarks. This offers a blueprint for targeted enhancement of model 'sub-skills' rather than broad fine-tuning.
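The core mechanic, fine-tuning only the parameters implicated in a localized circuit, can be sketched in a few lines. The toy model and the choice of which parameters form the "counting circuit" below are illustrative assumptions, not the paper's actual localization procedure; the sketch only shows how training is restricted to a named parameter subset.

```python
import torch
import torch.nn as nn

# Toy stand-in for an LVLM. In the paper, mechanistic analysis identifies
# which components form the counting circuit; here we simply pretend the
# final layer was flagged.
model = nn.Sequential(
    nn.Linear(8, 16),   # stand-in: upstream (non-circuit) layer
    nn.ReLU(),
    nn.Linear(16, 4),   # stand-in: layer hosting the "counting circuit"
)

# Hypothetical output of circuit localization: parameter names to train.
circuit_params = {"2.weight", "2.bias"}

# Freeze everything outside the circuit.
for name, p in model.named_parameters():
    p.requires_grad = name in circuit_params

opt = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-2
)

# One illustrative fine-tuning step on dummy data: gradients flow only
# into the circuit parameters, leaving the rest of the model untouched.
x, y = torch.randn(32, 8), torch.randn(32, 4)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
opt.step()
```

Because `requires_grad` is false elsewhere, the frozen layers receive no gradients at all, which is what makes the intervention "lightweight" relative to full fine-tuning.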

From the abstract

Counting serves as a simple but powerful test of a Large Vision-Language Model's (LVLM's) reasoning; it forces the model to identify each individual object and then add them all up. In this study, we investigate how LVLMs implement counting using controlled synthetic and real-world benchmarks, combined with mechanistic analyses. Our results show that LVLMs display a human-like counting behavior, with precise performance on small numerosities and noisy estimation for larger quantities. We introdu
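The "human-like" pattern the abstract describes, exact answers at small numerosities and noisy estimation at larger ones, matches an approximate-number-system model in which estimation noise scales with the true count. A minimal simulation of that behavior (the 0.15 Weber fraction is an illustrative assumption, not a value from the paper):

```python
import random

def noisy_count(n, weber=0.15, rng=random):
    # Estimator whose noise grows proportionally with the true count n,
    # as in Weber's-law-style approximate number estimation.
    return max(0, round(rng.gauss(n, weber * n)))

def exact_match_rate(n, trials=2000, seed=0):
    # Fraction of trials where the noisy estimate equals the true count.
    rng = random.Random(seed)
    return sum(noisy_count(n, rng=rng) == n for _ in range(trials)) / trials

# Exact-match accuracy is near-perfect for small counts and degrades
# steadily as n grows, mirroring the behavior reported for LVLMs.
rates = {n: exact_match_rate(n) for n in (1, 2, 3, 5, 10, 20)}
```

Plotting `rates` against `n` reproduces the qualitative curve the study reports for LVLMs: a plateau of precise performance at small numerosities followed by increasingly noisy estimation.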