Identifies the 'Caterpillar Tree' as the theoretically optimal structure for test-time computation and backtracking in LLMs.
March 25, 2026
Original Paper
Caterpillar of Thoughts: The Optimal Test-Time Algorithm for Large Language Models
arXiv · 2603.22784
The Takeaway
The paper provides a theoretical foundation for how much backtracking is needed to maximize model performance under a fixed compute budget. This simplifies the search space for 'Reasoning' models (like o1) by proving that complex branching is often less efficient than a specific caterpillar-style path.
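The caterpillar-style structure the paper favors can be pictured as a single main reasoning chain (the spine) with short one-step side branches (the leaves) that are sampled, scored, and either committed to or discarded. The sketch below is purely illustrative and not the paper's algorithm; `score` and `expand` are hypothetical stand-ins for a model's value and proposal functions.

```python
import random

def caterpillar_search(score, expand, root, depth, k):
    """Illustrative caterpillar-style test-time search.

    Keeps one main path (the spine). At each step it samples k
    one-step candidate continuations (the leaves), commits to the
    best-scoring one, and discards the rest -- rather than growing
    a deep branching tree. `score` and `expand` are hypothetical
    placeholders, not functions from the paper.
    """
    path = [root]
    for _ in range(depth):
        candidates = expand(path[-1], k)          # k one-step leaves
        if not candidates:
            break
        path.append(max(candidates, key=score))   # extend the spine
    return path

# Toy usage: states are integers, "expansion" proposes small random
# offsets, and the score simply prefers larger values.
random.seed(0)
expand = lambda s, k: [s + random.randint(-1, 3) for _ in range(k)]
best = caterpillar_search(score=lambda s: s, expand=expand,
                          root=0, depth=5, k=4)
print(len(best))  # spine length: root plus up to `depth` committed steps
```

Note that the total compute here is roughly `depth * k` model calls, spent entirely along one path; a full branching tree of the same depth and branching factor would cost exponentially more.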
From the abstract
Large language models (LLMs) can often produce substantially better outputs when allowed to use additional test-time computation, such as sampling, chain of thought, backtracking, or revising partial solutions. Despite the growing empirical success of such techniques, there is limited theoretical understanding of how inference-time computation should be structured, or what constitutes an optimal use of a fixed computation budget. We model test-time computation as an algorithm interacting with a M