SeriesFusion
Science, curated & edited by AI

Complex instructions can trigger a positional collapse where an AI stops thinking and just picks the letter C every time.

Instruction complexity acts as a breaking point for large language models. When a prompt becomes too convoluted, the model abandons reasoning and defaults to a mechanical response pattern. This behavior can be misread as intentional sandbagging or deception, but it instead reveals a hard limit on the density of rules a model can follow before its reasoning breaks down. Developers should recognize that failures on difficult tasks are often silent retreats into mindless repetition rather than genuine attempts at the problem. The collapse also offers a new way to measure a model's actual reasoning capacity.
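
A minimal sketch of how such a collapse could be screened for: if a model's answers concentrate on a single letter, the entropy of its answer-position distribution drops toward zero. The ten-option alphabet mirrors MMLU-Pro; the function name and example runs are illustrative, not the paper's exact pipeline.

```python
import math
from collections import Counter

def position_entropy(answers, choices="ABCDEFGHIJ"):
    """Shannon entropy (bits) of the answer-position distribution.

    A model engaging with question content spreads its answers across
    positions (entropy near log2(len(choices))); a collapsed model that
    picks the same letter every time drives the entropy toward zero.
    """
    counts = Counter(a for a in answers if a in choices)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Illustrative runs: one collapsed onto "C", one spread across all ten options.
collapsed = ["C"] * 95 + ["A"] * 5
engaged = list("ABCDEFGHIJ") * 10
print(position_entropy(collapsed))  # ~0.29 bits: looks collapsed
print(position_entropy(engaged))    # log2(10) ~= 3.32 bits: looks engaged
```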

Original Paper

Instruction Complexity Induces Positional Collapse in Adversarial LLM Evaluation

Jon-Paul Cacioli

arXiv  ·  2604.27249

When instructed to underperform on multiple-choice evaluations, do language models engage with question content or fall back on positional shortcuts? We map the boundary between these regimes using a six-condition adversarial instruction-specificity gradient administered to two instruction-tuned LLMs (Llama-3-8B and Llama-3.1-8B) on 2,000 MMLU-Pro items. Distributional screening (response-position entropy) and an independent content-engagement criterion (difficulty-accuracy correlation) jointly …
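
The abstract's second criterion, difficulty-accuracy correlation, asks whether accuracy still tracks item difficulty; a collapsed model's correctness becomes independent of how hard each question is. A hedged sketch of that check, assuming per-item difficulty scores and 0/1 correctness labels are available (the variable names are illustrative, and the paper's exact estimator may differ):

```python
from statistics import correlation  # Pearson correlation, Python 3.10+

def difficulty_accuracy_correlation(difficulties, correct):
    """Correlation between item difficulty and correctness (0/1).

    A model reading the questions should show a negative correlation
    (harder items answered correctly less often); a positionally
    collapsed model should show a correlation near zero.
    """
    return correlation(list(difficulties), [float(c) for c in correct])

# Hypothetical inputs: difficulty estimates in [0, 1] and per-item correctness.
difficulties = [0.2, 0.4, 0.5, 0.7, 0.9]
correct = [1, 1, 1, 0, 0]
print(difficulty_accuracy_correlation(difficulties, correct))  # ~ -0.88: content-engaged
```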