AI & ML Efficiency Breakthrough

A 4B parameter model matches a 120B parameter model in program verification through a rigorous data curation pipeline.

March 17, 2026

Original Paper

Not All Invariants Are Equal: Curating Training Data to Accelerate Program Verification with SLMs

Ido Pinto, Yizhak Yisrael Elboher, Haoze Wu, Nina Narodytska, Guy Katz

arXiv · 2603.15510

The Takeaway

Demonstrates that high-quality semantic rewriting and AST-based normalization of training data can enable Small Language Models (SLMs) to match the performance of models 30x their size. This provides a blueprint for specialized reasoning tasks where compute efficiency is critical.
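To make the idea of AST-based normalization concrete, here is a minimal sketch (not from the paper, which does not publish this exact procedure): structurally identical programs that differ only in identifier choice are canonicalized to the same form by renaming variables in traversal order, so near-duplicate training samples can be detected and deduplicated.

```python
import ast

class NameNormalizer(ast.NodeTransformer):
    """Rename every variable to a canonical name (v0, v1, ...) so that
    programs differing only in identifier choice normalize identically."""
    def __init__(self):
        self.mapping = {}

    def visit_Name(self, node):
        # Assign canonical names in first-seen order during the AST walk.
        if node.id not in self.mapping:
            self.mapping[node.id] = f"v{len(self.mapping)}"
        return ast.copy_location(
            ast.Name(id=self.mapping[node.id], ctx=node.ctx), node
        )

def normalize(src: str) -> str:
    """Parse, canonicalize identifiers, and unparse back to source."""
    tree = NameNormalizer().visit(ast.parse(src))
    return ast.unparse(tree)

# Two loops that differ only in variable names collapse to one form.
a = normalize("i = 0\nwhile i < n:\n    i = i + 1")
b = normalize("k = 0\nwhile k < m:\n    k = k + 1")
print(a == b)  # True
```

A real curation pipeline would normalize more than names (constants, statement order, comments), but identifier canonicalization alone already collapses a large class of superficial duplicates.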

From the abstract

The synthesis of inductive loop invariants is a critical bottleneck in automated program verification. While Large Language Models (LLMs) show promise in mitigating this issue, they often fail on hard instances, generating invariants that are invalid or computationally ineffective. While fine-tuning is a natural route to mitigate this limitation, obtaining high-quality training data for invariant generation remains an open challenge. We present a rigorous data curation pipeline designed to extract […]
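For readers unfamiliar with the term, an inductive loop invariant is a property that holds before the loop, is preserved by every iteration, and, combined with the exit condition, implies the postcondition. A minimal illustrative example (not taken from the paper), with the invariant checked as runtime assertions:

```python
def sum_below(n: int) -> int:
    """Sum the integers 0..n-1, annotated with its inductive invariant."""
    s, i = 0, 0
    assert s == i * (i - 1) // 2      # invariant holds on entry (0 == 0)
    while i < n:
        s += i
        i += 1
        assert s == i * (i - 1) // 2  # invariant preserved by each iteration
    # At exit i == n, so the invariant yields the closed-form postcondition.
    assert s == n * (n - 1) // 2
    return s

print(sum_below(10))  # 45
```

Verifiers such as those targeted by the paper must discover formulas like `s == i * (i - 1) // 2` automatically; synthesizing them for nontrivial loops is the bottleneck the abstract refers to.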