Reveals that self-distillation degrades out-of-distribution reasoning by suppressing 'epistemic verbalization' (the model's expression of uncertainty).
March 26, 2026
Original Paper
Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?
arXiv · 2603.24472
The Takeaway
While self-distillation is a standard post-training tool, this paper shows it can lead to a 40% performance drop on unseen problems. Short, confident reasoning traces, which distillation often rewards, turn out to be detrimental to robust problem-solving, suggesting a rethink of how we optimize reasoning models.
From the abstract
Self-distillation has emerged as an effective post-training paradigm for LLMs, often improving performance while shortening reasoning traces. However, in mathematical reasoning, we find that it can reduce response length while degrading performance. We trace this degradation to the suppression of epistemic verbalization, the model's expression of uncertainty during reasoning. Through controlled experiments varying conditioning context richness and task coverage, we show that conditioning the…
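As a rough illustration of what "epistemic verbalization" might look like in practice, the sketch below counts hedging cues in a reasoning trace and compares a longer, hedged trace against a short, confident one. The marker list and the sentence-level rate are illustrative assumptions, not the paper's actual metric.

```python
import re

# Illustrative hedging cues; the paper defines epistemic verbalization its own
# way, so treat this list as a hypothetical stand-in.
UNCERTAINTY_MARKERS = [
    r"\bwait\b", r"\bhmm+\b", r"\bnot sure\b", r"\bmaybe\b", r"\bperhaps\b",
    r"\bi think\b", r"\blet me (?:re)?check\b", r"\bdouble[- ]check\b",
    r"\bi might be wrong\b", r"\balternatively\b",
]

def epistemic_verbalization_rate(trace: str) -> float:
    """Fraction of sentences in a reasoning trace that verbalize uncertainty."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", trace.strip()) if s]
    if not sentences:
        return 0.0
    hedged = sum(
        any(re.search(p, s, flags=re.IGNORECASE) for p in UNCERTAINTY_MARKERS)
        for s in sentences
    )
    return hedged / len(sentences)

# Toy examples: a hedged pre-distillation trace vs. a short, confident one.
before = ("Let me check the discriminant first. Hmm, it could be negative. "
          "I think the roots are complex, but let me double-check by substituting.")
after = "The discriminant is negative, so the roots are complex."

print(f"before distillation: {epistemic_verbalization_rate(before):.2f}")
print(f"after distillation:  {epistemic_verbalization_rate(after):.2f}")
```

Under this toy metric, the shorter post-distillation trace scores near zero, mirroring the suppression of verbalized uncertainty the authors describe.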