Large language models can recite a task's rules perfectly and then, in the very next response, break every one of them.
Reasoning is typically treated as a linear process in which knowing a rule leads to following it. This study reveals a knows-but-violates dissociation: recall and execution come apart, so a model might list five constraints and then generate an answer that ignores all five. One practical consequence is that having a model recap its instructions does not, by itself, improve adherence. Developers need mechanisms that couple a model's internal recall to its final output, because simply knowing the rules is not enough for an agent to be reliable.
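If the finding holds, the pragmatic workaround is to score adherence on the generated output itself rather than trusting a recap. Below is a minimal sketch of that verification pattern; every name in it (`Constraint`, `check_adherence`, the toy rules) is hypothetical and illustrative, not taken from the paper or from any library.

```python
# Hypothetical sketch: verify constraint adherence against the final output,
# not against the model's (possibly accurate but inert) recap of the rules.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Constraint:
    description: str              # human-readable rule, e.g. from a research brief
    check: Callable[[str], bool]  # programmatic predicate applied to the output

def check_adherence(output: str, constraints: list[Constraint]) -> dict[str, bool]:
    """Score each constraint directly on the output text."""
    return {c.description: c.check(output) for c in constraints}

# Toy rules a model could recite flawlessly and still break.
constraints = [
    Constraint("mentions the control condition", lambda s: "control" in s.lower()),
    Constraint("stays under 50 words", lambda s: len(s.split()) < 50),
]

draft = "We propose a 60-word study design ..."  # stand-in for a model response
for rule, passed in check_adherence(draft, constraints).items():
    print(f"{'PASS' if passed else 'FAIL'}: {rule}")
```

The design choice worth noting: predicates that can be checked programmatically should be, and only genuinely open-ended constraints left to a judge model, since the dissociation suggests self-report is the weakest signal available.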
Models Recall What They Violate: Constraint Adherence in Multi-Turn LLM Ideation
arXiv · 2604.28031
When researchers iteratively refine ideas with large language models, do the models preserve fidelity to the original objective? We introduce DriftBench, a benchmark for evaluating constraint adherence in multi-turn LLM-assisted scientific ideation. Across 2,146 scored benchmark runs spanning seven models from five providers (including two open-weight), four interaction conditions, and 38 research briefs from 24 scientific domains, we find that iterative pressure reliably increases structural co…