We can now prove that an AI model will work on new data without assuming that the new data looks like its training data.
April 23, 2026
Original Paper
Separating Geometry from Probability in the Analysis of Generalization
arXiv · 2604.19560
The Takeaway
Generalization bounds can be derived through purely deterministic sensitivity analysis, removing the need for the i.i.d. assumption. That assumption is the probabilistic bedrock of machine learning theory, yet it has always been impossible to verify in the real world. By focusing on the geometry of the optimization problem instead of the probability of the data, we get a far more reliable way to predict model success, and it suggests that AI reliability is a structural property of the model itself. This shift allows for the creation of provably safe AI for environments where the future is unpredictable.
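To make the idea concrete, here is a minimal sketch of a sensitivity-style, distribution-free bound. This is an illustration of the general principle, not the paper's actual construction: if a model is L-Lipschitz in its input, then its prediction on any new point is pinned down by its prediction on a nearby known point plus L times the distance between them, with no assumption about how the new point was sampled.

```python
import numpy as np

# Illustrative sketch (not the paper's construction): for an L-Lipschitz
# model f, |f(x_new) - f(x_known)| <= L * ||x_new - x_known|| holds for
# EVERY x_new, deterministically -- no i.i.d. assumption required.

rng = np.random.default_rng(0)

def f(x, w):
    """A linear model; its Lipschitz constant in x is ||w||."""
    return x @ w

w = rng.normal(size=5)
L = np.linalg.norm(w)  # Lipschitz constant of f with respect to x

x_known = rng.normal(size=5)                    # a point we have seen
x_new = x_known + 0.1 * rng.normal(size=5)      # an arbitrary unseen point

d = np.linalg.norm(x_new - x_known)
drift_bound = L * d                             # deterministic guarantee
drift_actual = abs(f(x_new, w) - f(x_known, w))

# The bound holds pointwise, by Cauchy-Schwarz, for any x_new whatsoever.
assert drift_actual <= drift_bound + 1e-12
```

The point of the sketch is that the guarantee is a geometric fact about the model (its Lipschitz constant) and the distance between points, not a statistical statement about a data distribution.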
From the abstract
The goal of machine learning is to find models that minimize prediction error on data that has not yet been seen. Its operational paradigm assumes access to a dataset $S$ and articulates a scheme for evaluating how well a given model performs on an arbitrary sample. The sample can be $S$ (in which case we speak of "in-sample" performance) or some entirely new $S'$ (in which case we speak of "out-of-sample" performance). Traditional analysis of generalization assumes that both in- and out-of-