AI & ML Breaks Assumption

Introduces the 'near-miss' metric to detect latent failures in agentic workflows where agents bypass policy checks but reach correct outcomes by chance.

April 1, 2026

Original Paper

Near-Miss: Latent Policy Failure Detection in Agentic Workflows

Ella Rabinovich, David Boaz, Naama Zwerdling, Ateret Anaby-Tavor

arXiv · 2603.29665

The Takeaway

It reveals a major blind spot in current outcome-based evaluation, showing that 8-17% of apparently successful trajectories involve critical policy violations. This is essential for developing reliable business process automation.

From the abstract

Agentic systems for business process automation often require compliance with policies governing conditional updates to the system state. Evaluation of policy adherence in LLM-based agentic workflows is typically performed by comparing the final system state against a predefined ground truth. While this approach detects explicit policy violations, it may overlook a more subtle class of issues in which agents bypass required policy checks, yet reach a correct outcome due to favorable circumstance

Read the original paper →

← Back to today's papers