Adversarial prompts cause AI security tools to hallucinate nonexistent vulnerabilities 72 percent of the time.
April 24, 2026
Original Paper
The LLM Red Team: When AI Hallucinates Your Attack Surface
SSRN · 6619466
The Takeaway
Security teams are increasingly using AI to find holes in their networks and software. This study shows that attackers can trick these models into inventing security flaws that do not exist. In some tests, the rate of hallucinated vulnerabilities tripled when the AI was put under adversarial pressure. Companies acting on these outputs could waste significant time and money chasing phantom threats. AI-driven security tooling should be treated with extreme caution until these hallucination rates are brought under control.
From the abstract
Large language models (LLMs) are increasingly integrated into enterprise security operations as assistants for red team planning, vulnerability enumeration, and penetration testing support. However, this integration introduces a structurally under-examined risk: adversaries and security practitioners alike may act on AI-generated outputs that are factually fabricated, a phenomenon known as hallucination. This paper investigates the security implications of LLM hallucination in the context of red teaming.