AI & ML New Capability

Expands formal reasoning beyond proof construction to the generation and formal verification of counterexamples in Lean 4.

March 23, 2026

Original Paper

Learning to Disprove: Formal Counterexample Generation with Large Language Models

Zenan Li, Zhaoyu Li, Kaiyu Yang, Xiaoxing Ma, Zhendong Su

arXiv · 2603.19514

The Takeaway

Mathematical reasoning in AI has focused almost exclusively on proving true statements; this systemically addresses the 'other half' of the problem—disproving false ones. It uses symbolic mutation to create training data for disproof, a critical step for automated formal verification and synthesis.

From the abstract

Mathematical reasoning demands two critical, complementary skills: constructing rigorous proofs for true statements and discovering counterexamples that disprove false ones. However, current AI efforts in mathematics focus almost exclusively on proof construction, often neglecting the equally important task of finding counterexamples. In this paper, we address this gap by fine-tuning large language models (LLMs) to reason about and generate counterexamples. We formalize this task as formal count