AI can be trained to "look" at images much the way a human does without losing any of its ability to identify what it sees.
April 24, 2026
Original Paper
Cognitive Alignment At No Cost: Inducing Human Attention Biases For Interpretable Vision Transformers
arXiv · 2604.20027
The Takeaway
Fine-tuning a Vision Transformer on eye-tracking data makes the model's attention patterns match human gaze. The common assumption was that forcing an AI to attend more like a human would make it less accurate, but this experiment showed zero loss in performance. The result points to a more interpretable kind of AI, one where we can trust that the model is weighting the same important features that we are. It directly addresses the black-box problem of a model getting the right answer for the wrong reason, which matters in high-stakes fields like medical imaging, where we need to know why the AI made a diagnosis. In short, we can now align AI perception with human intuition for free.
From the abstract
For state-of-the-art image understanding, Vision Transformers (ViTs) have become the standard architecture, but their processing diverges substantially from human attentional characteristics. We investigate whether this cognitive gap can be shrunk by fine-tuning the self-attention weights of Google's ViT-B/16 on human saliency fixation maps. To isolate the effects of semantically relevant signals from generic human supervision, the tuned model is compared against a shuffled control. […]
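
Only the abstract is excerpted here, so the paper's exact training objective isn't spelled out. As a rough illustration, here is a minimal PyTorch sketch of what attention-alignment fine-tuning could look like: a KL-divergence term pulls the model's [CLS]-to-patch attention toward a human fixation map, while the usual classification loss preserves accuracy. The loss form, the choice of the last attention block, the google/vit-base-patch16-224 checkpoint, and the 14x14 fixation-map resolution are all assumptions for illustration, not the paper's published recipe.

```python
# Minimal sketch of attention-alignment fine-tuning (assumed objective,
# not the paper's published method).
import torch
import torch.nn.functional as F
from transformers import ViTForImageClassification

model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224"  # the ViT-B/16 checkpoint named in the abstract
)

def alignment_loss(pixel_values, labels, fixation_maps, lam=0.5):
    """fixation_maps: (batch, 14, 14) human saliency over the patch grid."""
    outputs = model(pixel_values=pixel_values, output_attentions=True)

    # Last-layer attention: (batch, heads, 197, 197); token 0 is [CLS].
    attn = outputs.attentions[-1]

    # [CLS] attention over the 196 patch tokens, averaged across heads,
    # renormalized to a probability distribution.
    cls_attn = attn[:, :, 0, 1:].mean(dim=1)
    cls_attn = cls_attn / cls_attn.sum(dim=-1, keepdim=True)

    # Human fixation map flattened and normalized the same way.
    human = fixation_maps.flatten(1)
    human = human / human.sum(dim=-1, keepdim=True)

    # KL(human || model): pushes model attention toward human fixations.
    kl = F.kl_div(cls_attn.clamp_min(1e-8).log(), human, reduction="batchmean")

    # Standard classification loss keeps task performance intact.
    ce = F.cross_entropy(outputs.logits, labels)
    return ce + lam * kl
```

Under this reading, the paper's shuffled control would correspond to permuting fixation_maps across unrelated images, so the supervision keeps generic human gaze statistics but carries no image-specific signal.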