Failed training runs and small experiments account for over 82.2% of the total compute used to develop modern reasoning models.
Public discussion usually focuses on the massive energy consumed by a model's final pretraining run. In practice, however, the trial-and-error phase is the largest environmental cost in AI development. Reasoning models in particular require 17 times more post-training effort than standard instruction-tuned models. This research exposes the hidden "dark matter" of AI energy consumption: to report their carbon footprint honestly, companies must disclose total development costs, not just the final training run.
The Hidden Cost of Thinking: Energy Use and Environmental Impact of LMs Beyond Pretraining
arXiv · 2605.01158
Modern language model development extends far beyond pretraining, yet environmental reporting remains narrowly focused on the cost of training a single final model. In this work, we provide the first detailed breakdown of the environmental impact of a full model development pipeline, from pretraining through supervised fine-tuning, preference optimization, and reinforcement learning, for Olmo 3, a family of 7-billion- and 32-billion-parameter models in both instruction-following and reasoning variants.