SeriesFusion
Science, curated & edited by AI

GPT-4o, Claude, and Gemini make strikingly similar forecasting errors despite being built by entirely different companies.

These three major language models share an epistemic monoculture that causes them to err in lockstep. Testing shows that their forecasting errors are highly correlated, creating a single point of failure for anyone relying on AI for a diversity of views. People often assume that consulting multiple models provides a safety net against bias; this research suggests that switching from one model to another offers little protection, because they all mirror the same underlying human biases. Relying on an AI consensus could therefore open catastrophic blind spots in global decision-making.

Original Paper

The Oracle's Fingerprint: Correlated AI Forecasting Errors and the Limits of Bias Transmission

Theodor Spiro

arXiv  ·  2605.00844

When large language models (LLMs) are consulted as forecasting tools, the independence of individual errors -- the foundation of collective intelligence -- may collapse. We test three conditions necessary for this "epistemic monoculture" to emerge. In Study 1, we show that GPT-4o, Claude, and Gemini exhibit highly correlated forecasting errors on 568 resolved binary prediction questions (mean pairwise error correlation r = 0.77, p < 0.001; r = 0.78 excluding likely-leaked questions), despite being built by different companies. …
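
To make the headline statistic concrete, here is a minimal sketch of how pairwise error correlations like these could be computed. The data below is randomly generated stand-in data, not the study's 568 questions, and the shared miscalibration term is an assumption added purely to mimic the monoculture effect the paper describes:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n_questions = 568  # matches the paper's sample size

# Resolved binary outcomes (1 = event happened, 0 = it did not).
outcomes = rng.integers(0, 2, size=n_questions).astype(float)

# Stand-in forecasts for three models. Each forecast mixes the true outcome
# with a SHARED systematic bias (underweighting the outcome) plus independent
# per-model noise; the shared component is what produces correlated errors.
def make_forecast():
    return np.clip(0.6 * outcomes + rng.normal(0.2, 0.2, n_questions), 0.0, 1.0)

forecasts = {"gpt-4o": make_forecast(),
             "claude": make_forecast(),
             "gemini": make_forecast()}

# A model's error on each question is its stated probability minus the outcome.
errors = {name: probs - outcomes for name, probs in forecasts.items()}

# Mean pairwise Pearson correlation of errors, as in the paper's Study 1.
pair_rs = []
for a, b in combinations(errors, 2):
    r = np.corrcoef(errors[a], errors[b])[0, 1]
    pair_rs.append(r)
    print(f"{a} vs {b}: r = {r:.2f}")

print(f"mean pairwise error correlation: r = {np.mean(pair_rs):.2f}")
```

Because every simulated model shares the same systematic bias term, their residual errors correlate even though their noise is independent; that is exactly why averaging the three forecasts would not wash the bias out.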