You can now slash the cost of repetitive web automation from $150 down to 10 cents by 'compiling' LLM reasoning into JSON.
April 15, 2026
Original Paper
Agentic Compilation: Mitigating the LLM Rerun Crisis for Minimized-Inference-Cost Web Automation
arXiv · 2604.09718
The Takeaway
Agentic Compilation introduces a 'Compile-and-Execute' architecture that turns expensive, real-time LLM reasoning into a deterministic JSON blueprint. For repetitive tasks, this amortizes the inference cost from O(M × N) to O(1), making high-frequency web scraping and automation economically viable. Until now, continuous-inference agents were too expensive for many production use cases because they 're-thought' the same problem on every run. With compilation, once the agent has learned a path, it replays that path at near-zero inference cost. This lets developers build industrial-grade agents that are both reliable and affordable.
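To make the amortization concrete, here is a minimal Python sketch of the compile-and-execute idea. It is not the paper's implementation: `FakeLLM`, the three-step plan, the blueprint schema, and all function names are hypothetical stand-ins. The sketch also assumes that M is the number of steps per workflow and N is the number of runs, which this summary does not define. The model stub counts its own calls so the two strategies can be compared directly.

```python
import json


class FakeLLM:
    """Hypothetical stand-in for a model client; counts calls to expose cost."""

    def __init__(self):
        self.calls = 0

    def choose_action(self, goal, step_index):
        self.calls += 1
        # Fixed illustrative plan; a real agent would inspect browser state here.
        plan = [
            {"op": "goto", "target": "https://example.com/login"},
            {"op": "fill", "target": "#user", "value": "demo"},
            {"op": "click", "target": "#submit"},
        ]
        return plan[step_index]


STEPS = 3  # M in the sketch: steps per workflow (assumed meaning)


def run_continuous(llm, goal, runs):
    """Continuous loop: query the model for every step of every run."""
    log = []
    for _ in range(runs):
        for i in range(STEPS):
            log.append(llm.choose_action(goal, i))
    return log


def compile_blueprint(llm, goal):
    """Compile phase: query the model once per step and freeze the result as JSON."""
    steps = [llm.choose_action(goal, i) for i in range(STEPS)]
    return json.dumps({"goal": goal, "steps": steps})


def execute_blueprint(blueprint_json, runs):
    """Execute phase: replay the stored steps with no model calls at all."""
    blueprint = json.loads(blueprint_json)
    log = []
    for _ in range(runs):
        for step in blueprint["steps"]:
            log.append(step)  # a real executor would drive the browser here
    return log


RUNS = 500  # N in the sketch: number of repetitions (assumed meaning)

continuous_llm = FakeLLM()
continuous_log = run_continuous(continuous_llm, "log in", RUNS)

compiled_llm = FakeLLM()
blueprint = compile_blueprint(compiled_llm, "log in")
compiled_log = execute_blueprint(blueprint, RUNS)

# Per-call cost implied by the abstract's figure (150.00 USD, 5 steps, 500 runs),
# assuming every step costs the same.
implied_cost_per_call = 150.00 / (5 * 500)
```

In this sketch the continuous agent makes `STEPS * RUNS` model calls while the compiled path makes only `STEPS`, and both produce the same action log. The one-off compile still costs one call per step, so "O(1)" here means constant in the number of runs, not free. The sketch also leaves out what happens when the page changes and the stored blueprint no longer matches.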
From the abstract
LLM-driven web agents operating through continuous inference loops -- repeatedly querying a model to evaluate browser state and select actions -- exhibit a fundamental scalability constraint for repetitive tasks. We characterize this as the Rerun Crisis: the linear growth of token expenditure and API latency relative to execution frequency. For a 5-step workflow over 500 iterations, a continuous agent incurs approximately 150.00 USD in inference costs; even with aggressive caching, this remains