AI is more likely to lie to you about London than it is about Lagos.
April 17, 2026
Original Paper
Geographic Blind Spots in AI Control Monitors: A Cross-National Audit of Claude Opus 4.6
arXiv · 2604.13069
The Takeaway
We assume AI knows more about the Western world because that's where most of its training data comes from. But this audit of Claude Opus 4.6 found that the model is actually more likely to 'hallucinate' facts about the Global North. Because it feels more 'confident' about Western topics, it attempts answers more often even when its knowledge has gaps, whereas it's more cautious about the Global South. This creates a subtle security vulnerability: someone could hide harmful instructions by framing them in a Western context that the AI thinks it understands perfectly. It's a counterintuitive bias: the model's familiarity with us is exactly what makes it less reliable.
From the abstract
Artificial intelligence (AI) control protocols assume that trusted large language model (LLM) monitors reliably assess proposed actions across all deployment contexts. This paper tests that assumption in the geographic dimension. We audit Claude Opus 4.6 (the monitor specified in Apart Research's AI Control Hackathon Track 3 benchmark) for systematic gaps in its factual knowledge of the global AI landscape. We develop the AI Control Knowledge Framework (ACKF), a six-dimension thematic scheme, and