AI & ML Efficiency Breakthrough

VGS-Decoding is a training-free method to mitigate medical VLM hallucinations by reweighting token probabilities based on their visual dependency.

March 24, 2026

Original Paper

VGS-Decoding: Visual Grounding Score Guided Decoding for Hallucination Mitigation in Medical VLMs

Govinda Kolli, Adinath Madhavrao Dukre, Behzad Bozorgtabar, Dwarikanath Mahapatra, Imran Razzak

arXiv · 2603.20314

The Takeaway

Hallucination in clinical settings is often driven by language priors overriding visual evidence. This method computes a 'Visual Grounding Score' at inference time to amplify visually grounded tokens and suppress hallucinated ones, yielding nearly 9% gains in recall with no additional training cost.

From the abstract

Medical Vision-Language Models (VLMs) often hallucinate by generating responses based on language priors rather than visual evidence, posing risks in clinical applications. We propose Visual Grounding Score Guided Decoding (VGS-Decoding), a training-free method to mitigate hallucinations during inference. Our key insight is that hallucinated tokens maintain or increase their probability when visual information is degraded, while visually grounded tokens decrease in probability. We introduce the…
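To make the key insight concrete, here is a minimal toy sketch of contrastive reweighting in this spirit. The function name, the grounding-score definition (drop in log-probability under image degradation), and the `alpha` weight are illustrative assumptions, not the paper's actual formulation; the full text is truncated above.

```python
import numpy as np

def vgs_reweight(logits_full, logits_degraded, alpha=1.0):
    """Toy reweighting: boost tokens whose probability drops when the image
    is degraded (visually grounded) and suppress tokens whose probability
    holds or rises (likely driven by language priors).

    NOTE: hypothetical sketch; the paper's actual score and update rule
    may differ.
    """
    # Log-probabilities under the full and degraded visual inputs.
    log_p_full = logits_full - np.log(np.sum(np.exp(logits_full)))
    log_p_deg = logits_degraded - np.log(np.sum(np.exp(logits_degraded)))
    # Assumed grounding score: positive when degradation hurts the token.
    vgs = log_p_full - log_p_deg
    # Reweight the original logits by the grounding score and renormalize.
    adjusted = logits_full + alpha * vgs
    probs = np.exp(adjusted - adjusted.max())
    return probs / probs.sum()

# Token 0's logit collapses when the image is degraded (visually grounded),
# so the reweighted distribution shifts probability toward it.
p = vgs_reweight(np.array([2.0, 1.0]), np.array([0.0, 1.0]))
```

In this two-token example the grounded token's share rises from about 0.73 under plain softmax to over 0.95 after reweighting, illustrating how degradation-sensitive tokens are amplified.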