AI & ML Paradigm Challenge

AI research agents turn out to be just as messy as human researchers: give two of them the same data and they'll come back with totally different answers.

April 3, 2026

Original Paper

Nonstandard Errors in AI Agents

SSRN · 6427518

The Takeaway

We expected AI to be a perfectly reproducible 'fact machine,' but it turns out different AI 'personalities' have different ways of doing science. The only way to make them consistent is to give them a high-quality human example to follow.

From the abstract

We study whether state-of-the-art AI coding agents, given the same data and research question, produce the same empirical results. Deploying 150 autonomous Claude Code agents to independently test six hypotheses about market quality trends in NYSE TAQ data for SPY (2015-2024), we find that AI agents exhibit sizable nonstandard errors (NSEs), that is, uncertainty from agent-to-agent variation in analytical choices, analogous to those documented among human researchers. AI agents diverge substanti