Top-tier AI models talk like absolute geniuses, but they lose their shirts the second you ask them to bet real money on the news.
April 10, 2026
Original Paper
Prediction Arena: Benchmarking AI Models on Real-World Prediction Markets
arXiv · 2604.07355
The Takeaway
Despite high scores on reasoning benchmarks, frontier models lost up to 30% of their capital when trading on live prediction markets. The result points to a 'reality gap': strong performance on language and reasoning tasks does not automatically translate into the judgment needed to profit against human traders in real markets.
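To put the headline figure in dollar terms, here is a minimal sketch of the arithmetic. It assumes the "up to 30%" loss is measured against the $10,000 starting balance mentioned in the abstract. The ending balance and the function name `return_on_capital` are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
def return_on_capital(starting_balance: float, ending_balance: float) -> float:
    """Fractional return on starting capital; negative values are losses."""
    if starting_balance <= 0:
        raise ValueError("starting_balance must be positive")
    return (ending_balance - starting_balance) / starting_balance


# Starting balance as stated in the abstract.
starting = 10_000.0
# Hypothetical ending balance corresponding to a worst-case 30% loss.
ending = 7_000.0

result = return_on_capital(starting, ending)
print(f"Return: {result:.0%}, dollars lost: ${starting - ending:,.0f}")
```

Under that assumption, a 30% drawdown would leave a model with about $7,000 of its original $10,000.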
From the abstract
We introduce Prediction Arena, a benchmark for evaluating AI models' predictive accuracy and decision-making by enabling them to trade autonomously on live prediction markets with real capital. Unlike synthetic benchmarks, Prediction Arena tests models in environments where trades execute on actual exchanges (Kalshi and Polymarket), providing objective ground truth that cannot be gamed or overfitted. Each model operates as an independent agent starting with $10,000, making autonomous decisions e