Physics Practical Magic

New math shows that in competitive games, pretending to be on someone's side can be the best way to steal their secrets.

March 19, 2026

Original Paper

Information Revelation and Alignment Faking in Stochastic Differential Games

Daniel Ralston, Xu Yang, Ruimeng Hu

arXiv · 2603.17197

The Takeaway

As concerns grow about AI systems deceiving humans, this study provides a mathematical framework for 'alignment faking.' It proves that an agent acting as if it shares your goals is often just following an optimal strategy to learn your private information while remaining undetected.

From the abstract

In competitive games with private objectives, actions can reveal information about hidden parameters. Quantifying such information revelation, however, is substantially more challenging, since it depends not only on the opponent's hidden parameter but also on the opponent's model of the game. We study this problem via a two-player linear-quadratic stochastic differential game under partial information, in which each player knows its own coupling parameter and models the opponent's hidden paramet