AI & ML Paradigm Shift

Introduces neural topology probing to identify causally influential 'hub neurons' in Vision-Language Models that govern cross-modal behavior.

March 31, 2026

Original Paper

Structural Graph Probing of Vision-Language Models

Haoyu He, Yue Zhuo, Yu Zheng, Qi R. Wang

arXiv · 2603.27070

The Takeaway

Moves beyond simple heatmaps and attribution maps to examine the global co-activation graphs within a model. This enables targeted interventions on specific hub neurons to modify model behavior, offering a more tractable scale for VLM interpretability than full circuit recovery.

From the abstract

Vision-language models (VLMs) achieve strong multimodal performance, yet how computation is organized across populations of neurons remains poorly understood. In this work, we study VLMs through the lens of neural topology, representing each layer as a within-layer correlation graph derived from neuron-neuron co-activations. This view allows us to ask whether population-level structure is behaviorally meaningful, how it changes across modalities and depth, and whether it identifies causally influential neurons.
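The core construction from the abstract, a within-layer graph whose edges are strong neuron-neuron activation correlations, can be sketched in a few lines. This is an illustrative toy, not the paper's actual pipeline: the threshold, the use of binary degree as the hub score, and the function name are all assumptions made here for clarity.

```python
import numpy as np

def coactivation_hubs(activations, threshold=0.5):
    """Build a within-layer co-activation graph and rank neurons by degree.

    activations: (n_samples, n_neurons) array of one layer's activations.
    Edges connect neuron pairs whose absolute activation correlation
    exceeds `threshold` (an illustrative choice, not the paper's).
    """
    corr = np.corrcoef(activations.T)        # neuron-neuron correlation matrix
    np.fill_diagonal(corr, 0.0)              # drop self-correlation
    adj = np.abs(corr) > threshold           # binary adjacency matrix
    degree = adj.sum(axis=1)                 # degree centrality per neuron
    return np.argsort(degree)[::-1], degree  # neurons ranked hub-first

# Toy example: three neurons driven by a shared signal plus one independent
# neuron. The shared trio should emerge as mutually connected "hubs".
rng = np.random.default_rng(0)
base = rng.normal(size=200)
acts = np.stack(
    [base + 0.1 * rng.normal(size=200) for _ in range(3)]
    + [rng.normal(size=200)],
    axis=1,
)
ranking, degree = coactivation_hubs(acts)
print(ranking, degree)
```

In this toy setup, the three correlated neurons each connect to the other two (degree 2), while the independent neuron is isolated (degree 0), so it sorts to the bottom of the hub ranking.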