Confident AI hallucinations leave a geometric fingerprint in the loss landscape that can be detected by stress-testing the model's gradients.
LLM errors that sound like facts often reside in sharp minima: the model's certainty collapses when the input is perturbed even slightly. Genuine knowledge, by contrast, corresponds to "flat facts" that remain stable under the same pressure. By measuring how sharply a model's gradients spike under small input perturbations, practitioners can flag confident falsehoods in real time. This geometric signature lets stubborn errors be detected without a ground-truth database, turning hallucination detection from a linguistic guessing game into a measurable property of the network's geometry.
From Flat Facts to Sharp Hallucinations: Detecting Stubborn Errors via Gradient Sensitivity
arXiv · 2605.00939
Traditional hallucination detection fails on "Stubborn Hallucinations" -- errors where LLMs are confidently wrong. We propose a geometric solution: Embedding-Perturbed Gradient Sensitivity (EPGS). We hypothesize that while robust facts reside in flat minima, stubborn hallucinations sit in sharp minima, supported by brittle memorization. EPGS detects this sharpness by perturbing input embeddings with Gaussian noise and measuring the resulting spike in gradient magnitude. This acts as an efficient, reference-free proxy for local sharpness, flagging stubborn hallucinations without access to a ground-truth database.
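To make the idea concrete, here is a minimal sketch of how an EPGS-style sharpness score could be computed with a Hugging Face causal LM. The noise scale `sigma`, the number of noise samples, the use of the model's own answer tokens as the loss target, and the ratio-based score are illustrative assumptions for this example, not the paper's exact implementation.

```python
# Sketch of an EPGS-style sharpness score (illustrative, not the paper's exact method).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def gradient_norm(model, embeds, attention_mask, labels):
    """Norm of the loss gradient with respect to the input embeddings."""
    embeds = embeds.clone().detach().requires_grad_(True)
    out = model(inputs_embeds=embeds, attention_mask=attention_mask, labels=labels)
    grad = torch.autograd.grad(out.loss, embeds)[0]
    return grad.norm().item()

def epgs_score(model, tokenizer, prompt, answer, sigma=0.01, n_samples=4):
    """Sharpness proxy: how much the gradient norm spikes when Gaussian noise
    is added to the input embeddings, relative to the clean input."""
    enc = tokenizer(prompt + answer, return_tensors="pt")
    labels = enc.input_ids.clone()
    # Score only the answer tokens: mask the prompt out of the loss.
    prompt_len = len(tokenizer(prompt).input_ids)
    labels[:, :prompt_len] = -100

    embeds = model.get_input_embeddings()(enc.input_ids)
    base = gradient_norm(model, embeds, enc.attention_mask, labels)

    perturbed = []
    for _ in range(n_samples):
        noisy = embeds + sigma * torch.randn_like(embeds)
        perturbed.append(gradient_norm(model, noisy, enc.attention_mask, labels))

    # A large relative spike suggests a sharp minimum, i.e. a likely stubborn hallucination.
    return (sum(perturbed) / n_samples) / (base + 1e-8)

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
score = epgs_score(model, tokenizer, "The capital of Australia is", " Sydney.")
print(f"EPGS-style sharpness ratio: {score:.2f}")
```

In this sketch, a ratio near 1 would indicate a flat region (gradients barely change under noise), while a much larger ratio would indicate the sharp, brittle behavior the paper associates with stubborn hallucinations.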