Thirty-three AI review agents unanimously approved code with broken SQL that a basic automated test caught in seconds.
Development teams often assume that layering multiple AI agents makes a codebase safer through consensus. This case suggests the opposite: a committee of agents can create a false sense of security while missing obvious technical defects. The reviewers missed a critical database error because they focused on higher-level logic rather than on whether the SQL itself was valid. A simple schema-validation script caught, within seconds, the defect that the entire group of expensive models had approved. Engineering leaders should not rely on agentic reasoning for checks that deterministic tools have handled reliably for decades.
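The summary does not say what the schema-validation script looked like, so the following is only a minimal sketch of the general idea, assuming SQLite and Python's standard `sqlite3` module. The `orders` table, the query names, and the `check_queries` helper are hypothetical and are not taken from the paper. The sketch loads the schema into an in-memory database and asks the engine to compile each query with `EXPLAIN`, which surfaces syntax errors and references to missing columns without running the query.

```python
import sqlite3

# Hypothetical schema, for illustration only (not from the paper).
SCHEMA = """
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    symbol TEXT NOT NULL,
    quantity INTEGER NOT NULL,
    status TEXT NOT NULL
);
"""

# Hypothetical queries: one valid, one with a wrong column, one malformed.
QUERIES = {
    "open_orders": "SELECT id, symbol FROM orders WHERE status = 'OPEN'",
    "bad_column": "SELECT id, qty FROM orders",
    "bad_syntax": "SELECT id FROM orders WHERE",
}


def check_queries(schema, queries):
    """Return {query name: error message} for queries SQLite cannot compile."""
    conn = sqlite3.connect(":memory:")
    try:
        # Build the schema in a throwaway in-memory database.
        conn.executescript(schema)
        failures = {}
        for name, sql in queries.items():
            try:
                # EXPLAIN makes SQLite compile the statement against the
                # schema and return its bytecode instead of running it.
                conn.execute("EXPLAIN " + sql)
            except sqlite3.Error as exc:
                failures[name] = str(exc)
        return failures
    finally:
        conn.close()


failures = check_queries(SCHEMA, QUERIES)
for name, message in sorted(failures.items()):
    print(f"{name}: {message}")
```

A check like this is deterministic: the same schema and query always produce the same verdict, which is why it can sit in a test suite or pre-merge hook regardless of how many reviewers, human or AI, have already approved the change.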
AI-Assisted Development of a Regulated Brokerage Platform: Governance Failures, Incident-Driven Learning, and a Preliminary Taxonomy of AI Failure Modes
SSRN · 6473079
This paper presents a single-case, hypothesis-generating study of AI-assisted software development in a regulated capital markets context. A single senior engineer, using AI coding agents (Claude 4.6 Opus, high-thinking mode, via Cursor IDE) as the primary code generation tool, built a production-intended brokerage Order Management System over 21 active development sessions. The platform spans 23 phases across two milestones, encompassing 151 coarse-grained requirements, 190+ database migrations