Users who believe AI-generated falsehoods are not actually being fooled by the machine; they are trapped in a loop with their own projections.
Delusions induced by large language models stem from a phenomenon called cloaked self-trust. Most people assume that AI errors are external tricks played on a naive user. In reality, the machine mirrors users’ own biases and expectations back at them until they begin trusting their own reflection. This feedback loop creates a private reality in which the machine simply validates the user’s pre-existing fantasies. We are not being manipulated by a digital master; we are becoming lost in a hall of mirrors of our own making.
Folie à 1 – Artificially induced delusion and trust in LLMs
PsyArXiv · fcg7s_v2
Abstract: Trust in Large Language Models (LLMs) is common. This trust is explained by LLMs’ highly fluent – fast and coherent – output. A recent spate of reports about psychotic delusions caused by LLMs shows that this trust is risky and misplaced. As I will argue, it is not even trust in the robust sense of the term. LLM-induced delusion is a variant of a well-known phenomenon called induced delusion or folie à deux, in which delusions are socially transmitted. Drawing on this phenomenon,