Introduces neural topology probing to identify causally influential 'hub neurons' in Vision-Language Models that govern cross-modal behavior.
March 31, 2026
Original Paper
Structural Graph Probing of Vision-Language Models
arXiv · 2603.27070
The Takeaway
Moves beyond simple heatmaps or per-feature attribution to examine the global co-activation graphs within a model. This enables targeted interventions on specific hub neurons to modify model behavior, providing a more tractable scale for VLM interpretability than full circuit recovery.
From the abstract
Vision-language models (VLMs) achieve strong multimodal performance, yet how computation is organized across populations of neurons remains poorly understood. In this work, we study VLMs through the lens of neural topology, representing each layer as a within-layer correlation graph derived from neuron-neuron co-activations. This view allows us to ask whether population-level structure is behaviorally meaningful, how it changes across modalities and depth, and whether it identifies causally influential neurons.
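The core construction the abstract describes can be illustrated with a minimal sketch. The function names, the Pearson-correlation edge criterion, the threshold value, and the degree-based hub ranking below are illustrative assumptions, not the paper's actual method; the paper may use a different graph construction or centrality measure.

```python
import numpy as np

def coactivation_graph(acts, threshold=0.5):
    """Build a within-layer co-activation graph (illustrative sketch).

    acts: (n_samples, n_neurons) activations recorded for one layer.
    Returns a binary adjacency matrix: an edge connects two neurons
    whose absolute Pearson correlation exceeds `threshold`.
    """
    corr = np.corrcoef(acts, rowvar=False)  # (n_neurons, n_neurons)
    np.fill_diagonal(corr, 0.0)             # drop self-correlations
    return np.abs(corr) > threshold

def hub_neurons(adj, top_k=5):
    """Rank neurons by degree; high-degree nodes are candidate 'hubs'."""
    degree = adj.sum(axis=1)
    return np.argsort(degree)[::-1][:top_k]

# Toy demo: a cluster of 3 correlated neurons plus 5 independent ones.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 1))
acts = np.hstack([
    base + 0.1 * rng.normal(size=(200, 3)),  # correlated cluster -> hubs
    rng.normal(size=(200, 5)),               # independent noise neurons
])
adj = coactivation_graph(acts, threshold=0.5)
print(hub_neurons(adj, top_k=3))  # indices from the correlated cluster {0, 1, 2}
```

A real analysis would record activations over a dataset of image-text inputs, build one graph per layer, and compare graph statistics across modalities and depth before intervening on the highest-centrality neurons.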