MARCH counters 'LLM-as-a-judge' confirmation bias through information asymmetry: verification agents must check individual claims without ever seeing the original response.
March 26, 2026
Original Paper
MARCH: Multi-Agent Reinforced Self-Check for LLM Hallucination
arXiv · 2603.24579
The Takeaway
Standard hallucination checks often fail because the verifier agrees with the generator's plausible-sounding errors. By decomposing responses into atomic claims and checking them in isolation, MARCH allows small 8B models to match the reliability of massive closed-source models in RAG settings.
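To make the decompose-then-verify idea concrete, here is a minimal Python sketch. The function names, the naive sentence-splitting decomposer, the lexical-overlap verifier, and the 0.6 threshold are all stand-ins assumed for illustration; in MARCH both steps are carried out by reinforcement-trained LLM agents. The point the sketch preserves is only the information asymmetry: the verifier sees each atomic claim plus the retrieved evidence, never the full generated response.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical stubs for illustration only: MARCH uses RL-trained LLM agents
# for both steps, not the simple heuristics below.

def decompose_into_claims(response: str) -> List[str]:
    """Naive sentence split standing in for an LLM-based atomic-claim extractor."""
    return [s.strip() for s in response.split(".") if s.strip()]

def verify_claim(claim: str, evidence: List[str]) -> bool:
    """Toy lexical-overlap check standing in for an LLM verifier call.

    The verifier receives only the isolated claim and the retrieved evidence,
    never the full generated response -- the information asymmetry that keeps
    it from simply agreeing with the generator's fluent output.
    """
    claim_tokens = set(claim.lower().split())
    return any(
        len(claim_tokens & set(doc.lower().split())) / len(claim_tokens) > 0.6
        for doc in evidence
    )

@dataclass
class ClaimVerdict:
    claim: str
    supported: bool

def self_check(response: str, evidence: List[str]) -> List[ClaimVerdict]:
    """Decompose a response into atomic claims and verify each one in isolation."""
    return [ClaimVerdict(c, verify_claim(c, evidence))
            for c in decompose_into_claims(response)]

if __name__ == "__main__":
    evidence = ["The Eiffel Tower is 330 metres tall and located in Paris."]
    response = "The Eiffel Tower is in Paris. It was built in 1820."
    for verdict in self_check(response, evidence):
        label = "SUPPORTED" if verdict.supported else "UNSUPPORTED"
        print(f"{label}: {verdict.claim}")
```

Running the stub marks "It was built in 1820" as unsupported because nothing in the retrieved evidence backs it, while the Paris claim passes; a judge shown the whole fluent response at once would be more prone to wave both claims through.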
From the abstract
Hallucination remains a critical bottleneck for large language models (LLMs), undermining their reliability in real-world applications, especially in Retrieval-Augmented Generation (RAG) systems. While existing hallucination detection methods employ LLM-as-a-judge to verify LLM outputs against retrieved evidence, they suffer from inherent confirmation bias, where the verifier inadvertently reproduces the errors of the original generation. To address this, we introduce Multi-Agent Reinforced Self-Check (MARCH).