AI & ML New Capability

A training-free metacognitive framework that gives LLMs explicit control over expanding, pruning, and repairing reasoning trajectories during inference.

March 31, 2026

Original Paper

CoT2-Meta: Budgeted Metacognitive Control for Test-Time Reasoning

Siyuan Ma, Bo Gao, Zikai Xiao, Hailong Wang, Xinlei Yu, Rui Qian, Jiayu Qian, Luqi Gong, Yang Liu

arXiv · 2603.28135

The Takeaway

CoT2-Meta moves beyond simple Best-of-N sampling by using a meta-controller to allocate the search budget dynamically. Reaching 92.8 on MATH and 48.8 on HLE without retraining suggests that explicit control over the search is an effective way to scale test-time compute.
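To make the contrast with Best-of-N concrete, here is a minimal sketch of a budgeted controller over partial reasoning trajectories. It is an illustration under stated assumptions, not the paper's algorithm: the function `budgeted_search`, the thresholds `prune_below` and `accept_at`, and the toy scoring problem are all hypothetical, and the repair step is omitted. Best-of-N would spend the same budget on independent full chains; this loop instead spends each unit on the most promising partial trajectory, drops weak branches early, and abstains if the budget runs out.

```python
import heapq
from itertools import count


def budgeted_search(expand, score, is_final, budget,
                    prune_below=0.3, accept_at=0.8):
    """Toy meta-controller (hypothetical, not the paper's method).

    Returns (steps, calls_used); steps is None when the controller abstains.
    """
    tie = count()                       # tie-breaker so heapq never compares lists
    frontier = [(-1.0, next(tie), [])]  # max-heap via negated score
    calls = 0
    while frontier:
        # Expand the most promising partial trajectory first.
        _, _, steps = heapq.heappop(frontier)
        for step in expand(steps):
            if calls >= budget:
                return None, calls      # budget exhausted: abstain
            calls += 1                  # one budget unit per generated step
            candidate = steps + [step]
            s = score(candidate)
            if s < prune_below:
                continue                # prune a weak branch
            if is_final(candidate):
                if s >= accept_at:
                    return candidate, calls  # confident answer: stop early
                continue                # finished but unconvincing: discard
            heapq.heappush(frontier, (-s, next(tie), candidate))
    return None, calls                  # nothing left to expand: abstain


# Toy stand-ins for the generator and the evaluator (assumptions):
# each "reasoning step" is a number, and the goal is a total of TARGET.
TARGET = 6


def expand(steps):
    return [1, 2, 3]


def score(steps):
    total = sum(steps)
    return 0.0 if total > TARGET else 1 - (TARGET - total) / TARGET


def is_final(steps):
    return sum(steps) >= TARGET


print(budgeted_search(expand, score, is_final, budget=10))  # ([3, 3], 6)
print(budgeted_search(expand, score, is_final, budget=4))   # (None, 4)
```

With a budget of 10 the controller answers after 6 generated steps and leaves the rest unspent; with a budget of 4 it abstains rather than return a low-confidence answer.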

From the abstract

Recent test-time reasoning methods improve performance by generating more candidate chains or searching over larger reasoning trees, but they typically lack explicit control over when to expand, what to prune, how to repair, and when to abstain. We introduce CoT2-Meta, a training-free metacognitive reasoning framework that combines object-level chain-of-thought generation with meta-level control over partial reasoning trajectories. The framework integrates four components: strategy-conditioned t