Introduces a vision model testbed that aligns AI visual attention (scanpaths) with human gaze without sacrificing classification accuracy.
March 31, 2026
Original Paper
EVA: Bridging Performance and Human Alignment in Hard-Attention Vision Models for Image Classification
arXiv · 2603.27340
The Takeaway
The paper shows that the 'alignment tax' in vision models can be mitigated with neuroscience-inspired hard attention, offering a path toward models that are both performant and inherently more interpretable to human observers.
From the abstract
Optimizing vision models purely for classification accuracy can impose an alignment tax, degrading human-like scanpaths and limiting interpretability. We introduce EVA, a neuroscience-inspired hard-attention mechanistic testbed that makes the performance-human-likeness trade-off explicit and adjustable. EVA samples a small number of sequential glimpses using a minimal fovea-periphery representation with a CNN-based feature extractor and integrates variance control and adaptive gating to stabilize…
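The fovea-periphery representation described in the abstract can be illustrated with a minimal sketch: at each fixation, the model receives a small full-resolution foveal crop plus a larger, coarsely pooled peripheral crop, mimicking the retina's resolution falloff. All function names, patch sizes, and pooling factors below are illustrative assumptions, not EVA's actual implementation.

```python
import numpy as np

def glimpse(image, fix_y, fix_x, fovea=8, periphery=24, pool=3):
    """Extract a fovea-periphery glimpse at fixation (fix_y, fix_x).

    Hypothetical sketch: returns a full-resolution foveal patch and a
    coarsely average-pooled peripheral patch. `periphery` must be
    divisible by `pool`.
    """
    pad = periphery // 2
    # Pad edges so glimpses near the border stay in bounds.
    padded = np.pad(image, pad, mode="edge")
    cy, cx = fix_y + pad, fix_x + pad

    # Fovea: small, full-resolution crop centred on the fixation.
    fh = fovea // 2
    fov = padded[cy - fh:cy + fh, cx - fh:cx + fh]

    # Periphery: larger crop, average-pooled to discard fine detail.
    ph = periphery // 2
    per = padded[cy - ph:cy + ph, cx - ph:cx + ph]
    n = periphery // pool
    per = per.reshape(n, pool, n, pool).mean(axis=(1, 3))
    return fov, per

image = np.arange(64 * 64, dtype=float).reshape(64, 64)
fov, per = glimpse(image, 32, 32)
# Both views are small (here 8x8), so a sequence of a few glimpses
# sees far fewer pixels than the full image - the hard-attention budget.
```

A sequence of such glimpses, chosen by a learned policy, would form the model's scanpath, which can then be compared against human gaze data.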