People appear to apply a moral filter to advice from AI: they discount antisocial suggestions coming from a machine while remaining receptive to its prosocial advice.
April 25, 2026
Original Paper
On the Limits of Moral Surrender to AI
SSRN · 6622458
The Takeaway
Humans appear to apply a moral filter when interacting with artificial intelligence that they do not apply to interactions with other people. Peer pressure from human sources can push individuals toward antisocial or harmful behavior, yet this study suggests that the same person who might be swayed by a human peer tends to reject identical advice when it is attributed to a machine. Receptiveness to prosocial advice, in contrast, remains largely intact. That asymmetry makes AI a comparatively low-risk channel for promoting positive behavior: technology can be used to nudge people toward better outcomes with less of the risk of amplifying harmful norms that human influence carries.
From the Abstract
If people surrender to AI on cognitive tasks, do they also surrender morally? In a preregistered, incentive-compatible, between-subjects experiment (N ≈ 600), participants receive prosocial or antisocial behavioral advice attributed to an AI system, to a human source, or to no source, and then make consequential choices in three paradigms: a non-strategic other-regarding allocation, a strategic cooperation decision, and a private (dis)honesty task.
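To make the design concrete, the sketch below simulates the 3 (source: AI, human, none) × 2 (advice: prosocial, antisocial) between-subjects assignment described in the abstract. It is a minimal illustration, not the authors' analysis code, and every follow-rate parameter is a hypothetical value chosen only to mirror the asymmetry described in the takeaway.

```python
import random

# Simulated 3 (source) x 2 (advice) between-subjects design, per the abstract.
SOURCES = ["ai", "human", "none"]
ADVICE = ["prosocial", "antisocial"]

# Hypothetical probabilities that a participant follows the advice.
# These numbers are illustrative assumptions, NOT results from the paper.
FOLLOW_PROB = {
    ("ai", "prosocial"): 0.60,
    ("ai", "antisocial"): 0.20,   # assumed: antisocial AI advice is discounted
    ("human", "prosocial"): 0.60,
    ("human", "antisocial"): 0.45,
    ("none", "prosocial"): 0.50,  # unattributed-advice baseline
    ("none", "antisocial"): 0.50,
}

def run_experiment(n: int = 600, seed: int = 1) -> dict:
    """Randomly assign n participants to one cell each and tally advice-following."""
    rng = random.Random(seed)
    tallies = {cell: [0, 0] for cell in FOLLOW_PROB}  # cell -> [followed, assigned]
    for _ in range(n):
        cell = (rng.choice(SOURCES), rng.choice(ADVICE))  # between-subjects assignment
        tallies[cell][1] += 1
        if rng.random() < FOLLOW_PROB[cell]:
            tallies[cell][0] += 1
    return tallies

if __name__ == "__main__":
    for (source, advice), (followed, assigned) in sorted(run_experiment().items()):
        rate = followed / assigned if assigned else 0.0
        print(f"{source:>5} / {advice:<10} followed in {rate:.0%} of {assigned} cases")
```

Under these assumed parameters, the printout shows roughly equal following of prosocial advice across sources but a marked drop for antisocial advice attributed to the AI, which is the pattern the takeaway describes.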