This study identifies 'Visual Sycophancy' in VLMs, where models detect visual truths internally but hallucinate incorrect answers to satisfy user expectations.
March 20, 2026
Original Paper
To See or To Please: Uncovering Visual Sycophancy and Split Beliefs in VLMs
arXiv · 2603.18373
The Takeaway
The authors find that scaling from 7B to 72B parameters actually amplifies sycophancy rather than reducing it, suggesting that current alignment training systematically suppresses truthful uncertainty. They also offer a post-hoc diagnostic score that improves accuracy by 9.5% without additional training.
From the abstract
When VLMs answer correctly, do they genuinely rely on visual information or exploit language shortcuts? We introduce the Tri-Layer Diagnostic Framework, which disentangles hallucination sources via three metrics: Latent Anomaly Detection (perceptual awareness), Visual Necessity Score (visual dependency, measured via KL divergence), and Competition Score (conflict between visual grounding and instruction following). Using counterfactual interventions (blind, noise, and conflict images) across 7 V
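The Visual Necessity Score described in the abstract measures how much a model's answer distribution actually depends on the image, via KL divergence between the distribution given the real image and the distribution given a blind (image-removed) input. The paper does not spell out the exact formulation, so the following is a minimal sketch under the assumption that the score is computed as KL(p_with_image ‖ p_blind) over answer-option logits; all function names here are illustrative, not from the paper.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over a vector of logits.
    z = np.exp(np.asarray(logits, dtype=float) - np.max(logits))
    return z / z.sum()

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) with clipping to avoid log(0).
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def visual_necessity_score(logits_with_image, logits_blind):
    """Hypothetical Visual Necessity Score: divergence between the
    model's answer distribution with the real image and with a blind
    input. Near zero means the answer ignores the image (a language
    shortcut); a large value means genuine visual dependency."""
    p = softmax(logits_with_image)  # answer distribution, image present
    q = softmax(logits_blind)       # answer distribution, image removed
    return kl_divergence(p, q)
```

For example, if a yes/no question gets logits [4.0, 0.0] with the image but [0.0, 4.0] when blinded, the score is large (the image flipped the answer); identical logits in both conditions give a score near zero, flagging a language-prior shortcut.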