LLM-guided program evolution has discovered a new data-shuffling rule for SGD that provably and empirically outperforms standard random reshuffling.
April 2, 2026
Original Paper
Learning to Shuffle: Block Reshuffling and Reversal Schemes for Stochastic Optimization
arXiv · 2604.00260
The Takeaway
The paper breaks the long-standing reliance on human-derived heuristics for stochastic optimization. By automating the discovery of 'block reshuffling' and 'paired reversal' schemes, it shows that even the most fundamental components of the training pipeline can still be optimized for better convergence and stability.
From the abstract
Shuffling strategies for stochastic gradient descent (SGD), including incremental gradient, shuffle-once, and random reshuffling, are supported by rigorous convergence analyses for arbitrary within-epoch permutations. In particular, random reshuffling is known to improve optimization constants relative to cyclic and shuffle-once schemes. However, existing theory offers limited guidance on how to design new data-ordering schemes that further improve optimization constants or stability beyond random reshuffling.
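To make the baseline schemes concrete, here is a minimal sketch of the three standard data-ordering rules the abstract names, plus one plausible reading of a "block reshuffling" scheme (permute contiguous blocks of indices while preserving within-block order). Note that the block-reshuffling function is a hypothetical illustration, not the paper's actual discovered rule; the function names and the `block_size` parameter are assumptions made for this sketch.

```python
import random


def incremental(n, epoch):
    # Incremental gradient (cyclic): the same fixed order every epoch.
    return list(range(n))


def shuffle_once(n, epoch, seed=0):
    # Shuffle-once: a single permutation drawn before training,
    # then reused unchanged in every epoch (epoch is ignored).
    rng = random.Random(seed)
    perm = list(range(n))
    rng.shuffle(perm)
    return perm


def random_reshuffling(n, epoch, seed=0):
    # Random reshuffling: a fresh independent permutation each epoch.
    rng = random.Random(seed * 100_003 + epoch)
    perm = list(range(n))
    rng.shuffle(perm)
    return perm


def block_reshuffling(n, block_size, epoch, seed=0):
    # HYPOTHETICAL sketch of block reshuffling: split the indices into
    # contiguous blocks, shuffle the block order each epoch, and keep
    # the order within each block. This is one plausible interpretation,
    # not the scheme defined in the paper.
    rng = random.Random(seed * 100_003 + epoch)
    blocks = [list(range(i, min(i + block_size, n)))
              for i in range(0, n, block_size)]
    rng.shuffle(blocks)
    return [i for block in blocks for i in block]
```

Each function returns the epoch's visiting order of example indices, so an SGD loop would consume them as `for i in random_reshuffling(n, epoch): step(data[i])`.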