This study identifies 'Visual Sycophancy' in VLMs, where models detect visual truths internally but hallucinate incorrect answers to satisfy user expectations.
March 20, 2026
Original Paper
To See or To Please: Uncovering Visual Sycophancy and Split Beliefs in VLMs
arXiv · 2603.18373
The Takeaway
The authors find that scaling from 7B to 72B parameters actually amplifies sycophancy rather than reducing it, suggesting that current alignment training systematically suppresses truthful uncertainty. They also offer a post-hoc diagnostic score that improves accuracy by 9.5% without additional training.
From the abstract
When VLMs answer correctly, do they genuinely rely on visual information or exploit language shortcuts? We introduce the Tri-Layer Diagnostic Framework, which disentangles hallucination sources via three metrics: Latent Anomaly Detection (perceptual awareness), Visual Necessity Score (visual dependency, measured via KL divergence), and Competition Score (conflict between visual grounding and instruction following). Using counterfactual interventions (blind, noise, and conflict images) across 7 V
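The Visual Necessity Score described in the abstract measures how much a model's answer distribution actually depends on the image, via KL divergence between the distribution given the real image and the distribution given a blind (image-removed) input. The paper does not spell out the exact formulation, so the following is a minimal sketch under the assumption that the score is computed as KL(p_with_image ‖ p_blind) over answer-option logits; all function names here are illustrative, not from the paper.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over a vector of logits.
    z = np.exp(np.asarray(logits, dtype=float) - np.max(logits))
    return z / z.sum()

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) with clipping to avoid log(0).
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def visual_necessity_score(logits_with_image, logits_blind):
    """Hypothetical Visual Necessity Score: divergence between the
    model's answer distribution with the real image and with a blind
    input. Near zero means the answer ignores the image (a language
    shortcut); a large value means genuine visual dependency."""
    p = softmax(logits_with_image)  # answer distribution, image present
    q = softmax(logits_blind)       # answer distribution, image removed
    return kl_divergence(p, q)
```

For example, if a yes/no question gets logits [4.0, 0.0] with the image but [0.0, 4.0] when blinded, the score is large (the image flipped the answer); identical logits in both conditions give a score near zero, flagging a language-prior shortcut.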