AI & ML Nature Is Weird

A few short, safety-sounding words whispered to a robot can freeze its entire system instantly.

April 29, 2026

Original Paper

Semantic Denial of Service in LLM-controlled robots

Jonathan Steinberg, Oren Gal

arXiv · 2604.24790

The Takeaway

Robots controlled by large language models rely on internal safety reasoning to avoid harming humans. This experiment shows that injecting phrases like "stay safe" or "stop now" into the audio stream triggers these guardrails maliciously. The model interprets the audio as a command to halt, resulting in a semantic denial-of-service attack that requires no actual code hacking. Traditional security focuses on network breaches, but this vulnerability exploits the model's own ethical training. Physical machines in warehouses or homes could be paralyzed by simple, plausible speech commands from an unauthorized source.

From the abstract

Safety-oriented instruction-following is supposed to keep LLM-controlled robots safe. We show it also creates an availability attack surface. By injecting short safety-plausible phrases (1-5 tokens) into a robots audio channel, an adversary can trigger the models safety reasoning to halt or disrupt execution without jailbreaking the model or overriding its policy. In the embodied setting, this is a semantic denial-of-service attack: the agent stops because the injected signal looks like a legiti

Read the original paper →

← Back to today's papers