Provides a theoretical explanation for why Transformers often underperform linear models in financial time series forecasting.
April 2, 2026
Original Paper
Forecast collapse of transformer-based models under squared loss in financial time series
arXiv · 2604.00064
The Takeaway
The paper shows that under squared loss in low-signal environments such as finance, high model expressivity leads to 'forecast collapse', in which models generate spurious fluctuations rather than capturing the conditional expectation. This challenges the 'bigger is better' paradigm for financial ML and suggests that specific architectural constraints are needed to avoid variance-driven error degradation.
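The variance mechanism described above can be illustrated with a small simulation. This is a minimal sketch, not the paper's experiment: it assumes i.i.d. Gaussian returns (so the conditional expectation given past returns is exactly zero), and it uses a 1-nearest-neighbour regressor as a stand-in for a highly expressive model rather than a Transformer. All names (`make_lagged`, `mse`, the lag count and sample sizes) are hypothetical choices made for illustration.

```python
import numpy as np


def make_lagged(returns, n_lags):
    """Build a design matrix of the previous n_lags returns and the next-step target."""
    n = len(returns)
    # Column k holds the return observed k+1 steps before the target.
    X = np.column_stack(
        [returns[n_lags - k - 1 : n - k - 1] for k in range(n_lags)]
    )
    y = returns[n_lags:]
    return X, y


def mse(pred, target):
    """Mean squared error between forecasts and realised values."""
    return float(np.mean((pred - target) ** 2))


# Assumption: returns are i.i.d. noise, so the Bayes-optimal forecast is zero.
rng = np.random.default_rng(0)
n_lags, n_train, n_test = 5, 500, 500
sigma = 0.01
returns = rng.normal(0.0, sigma, size=n_train + n_test + n_lags)

X, y = make_lagged(returns, n_lags)
X_tr, y_tr = X[:n_train], y[:n_train]
X_te, y_te = X[n_train:], y[n_train:]

# 1) Trivial predictor: always forecast a zero return.
pred_zero = np.zeros_like(y_te)

# 2) Low-expressivity model: ordinary least squares on the lags, with intercept.
A_tr = np.column_stack([np.ones(len(X_tr)), X_tr])
A_te = np.column_stack([np.ones(len(X_te)), X_te])
beta, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
pred_lin = A_te @ beta

# 3) High-expressivity stand-in: 1-nearest-neighbour regression, which
#    interpolates the training set and therefore memorises its noise.
d2 = ((X_te[:, None, :] - X_tr[None, :, :]) ** 2).sum(axis=2)
pred_nn = y_tr[d2.argmin(axis=1)]

results = {
    "zero": mse(pred_zero, y_te),
    "linear": mse(pred_lin, y_te),
    "nearest_neighbour": mse(pred_nn, y_te),
}
forecast_var = {
    "linear": float(pred_lin.var()),
    "nearest_neighbour": float(pred_nn.var()),
}

for name, value in results.items():
    print(f"{name:>18s}  out-of-sample MSE = {value:.3e}")
for name, value in forecast_var.items():
    print(f"{name:>18s}  forecast variance = {value:.3e}")
```

In this setup the linear fit has only six parameters, so its forecasts stay close to zero. The nearest-neighbour forecasts instead fluctuate about as much as the returns themselves, and because those fluctuations are unrelated to the realised values, its squared error comes out at roughly twice the noise variance.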
From the abstract
We study trajectory forecasting under squared loss for time series with weak conditional structure, using highly expressive prediction models. Building on the classical characterization of squared-loss risk minimization, we emphasize regimes in which the conditional expectation of future trajectories is effectively degenerate, leading to trivial Bayes-optimal predictors (flat for prices and zero for returns in standard financial settings). In this regime, increased model expressivity does not i