AI & ML Paradigm Shift

Latent representations of reasoning survive cross-architecture translation, allowing student models to inherit teacher capabilities without fine-tuning or weight updates.

March 24, 2026

Original Paper

Thinking in Different Spaces: Domain-Specific Latent Geometry Survives Cross-Architecture Translation

Marcus Armstrong, Navid Ayoobi, Arjun Mukherjee

arXiv · 2603.20406

The Takeaway

The paper shows that a simple linear projection can map a teacher model's activations into a student model's latent space well enough to correct the student's reasoning behavior at inference time. This suggests that model 'transplants' and cross-architecture interventions are feasible without expensive fine-tuning or weight updates.
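The core mechanics can be sketched in a few lines. The sketch below is a minimal illustration, not the authors' code: it assumes paired teacher/student activations collected on the same inputs (simulated here with a known ground-truth map plus noise), learns the linear projection by ordinary least squares, and shows the inference-time substitution as a plain function.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: teacher hidden size 16, student hidden size 8.
d_teacher, d_student, n_pairs = 16, 8, 500

# Paired activations from both models on the same prompts
# (simulated: a ground-truth linear map plus small noise).
true_map = rng.normal(size=(d_teacher, d_student))
teacher_acts = rng.normal(size=(n_pairs, d_teacher))
student_acts = teacher_acts @ true_map + 0.01 * rng.normal(size=(n_pairs, d_student))

# Learn the projection W minimizing ||teacher_acts @ W - student_acts||_F
# via ordinary least squares.
W, *_ = np.linalg.lstsq(teacher_acts, student_acts, rcond=None)

def intervene(teacher_hidden: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Inference-time edit: replace the student's residual-stream state
    with the teacher activation projected into the student's coordinates."""
    return teacher_hidden @ W

# Sanity check: projected teacher states should land close to the
# student's own activations (low relative error).
rel_err = np.linalg.norm(teacher_acts @ W - student_acts) / np.linalg.norm(student_acts)
```

In practice the substitution would happen inside a forward hook on a chosen student layer during generation; the least-squares fit stands in for whatever objective the authors use to learn the projection.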

From the abstract

We investigate whether independently trained language models converge to geometrically compatible latent representations, and whether this compatibility can be exploited to correct model behavior at inference time without any weight updates. We learn a linear projection matrix that maps activation vectors from a large teacher model into the coordinate system of a smaller student model, then intervene on the student's residual stream during generation by substituting its internal state with the t