AI is more likely to lie to you about London than it is about Lagos.
April 17, 2026
Original Paper
Geographic Blind Spots in AI Control Monitors: A Cross-National Audit of Claude Opus 4.6
arXiv · 2604.13069
The Takeaway
We assume AI knows more about the Western world because that's where most of its training data comes from. But this audit of Claude Opus 4.6 found that the model is actually more likely to 'hallucinate' facts about the Global North. Because it feels more 'confident' about Western topics, it attempts answers more often even when its knowledge has gaps, whereas it's more cautious about the Global South. This creates a subtle security vulnerability: someone could hide harmful instructions by framing them in a Western context that the AI thinks it understands perfectly. It's a counterintuitive bias: the model's familiarity with us is exactly what makes it less reliable.
From the abstract
Artificial intelligence (AI) control protocols assume that trusted large language model (LLM) monitors reliably assess proposed actions across all deployment contexts. This paper tests that assumption in the geographic dimension. We audit Claude Opus 4.6 (the monitor specified in Apart Research's AI Control Hackathon Track 3 benchmark) for systematic gaps in its factual knowledge of the global AI landscape. We develop the AI Control Knowledge Framework (ACKF), a six-dimension thematic scheme, and