Formalizes the 'Observability Gap' to explain why coding agents plateau: humans can only provide feedback on visible outputs, while bugs reside in invisible execution states.
March 31, 2026
Original Paper
The Observability Gap: Why Output-Level Human Feedback Fails for LLM Coding Agents
arXiv · 2603.26942
The Takeaway
This finding explains why standard RLHF/feedback loops fail for complex autonomous agents. It suggests that unless agents expose internal execution state to the evaluator, feedback will cause failure-mode oscillation rather than convergence, fundamentally limiting the reliability of 'black-box' agentic workflows.
From the abstract
Large language model (LLM) multi-agent coding systems typically fix agent capabilities at design time. We study an alternative setting, earned autonomy, in which a coding agent starts with zero pre-defined functions and incrementally builds a reusable function library through lightweight human feedback on visual output alone. We evaluate this setup in a Blender-based 3D scene generation task requiring both spatial reasoning and programmatic geometric control. Although the agent rediscovered core …