AI can hide a program's logic by making its code look simpler than it actually is.
Language models can obfuscate code by expanding its physical structure while lowering its logical complexity. This "obfuscation by simplification" is hard for automated tools to reverse-engineer: pattern-based deobfuscators are tuned to detect complex, tangled logic, so clean-looking transformations evade them. The counterintuitive strategy lets developers protect intellectual property behind apparent transparency, with implications for both software protection and malware analysis, where simple-looking code becomes an effective hiding place.
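To make the idea concrete, here is an illustrative sketch (not taken from the paper) of what such a transformation can look like: a dense one-line function is rewritten as several trivially simple helpers, so each unit has low logical complexity while the overall code grows in size. The function names and the discount scenario are invented for illustration.

```python
def discount_compact(price: float, is_member: bool) -> float:
    """Original form: a dense conditional expression in one line."""
    return price * (0.9 if is_member and price > 100 else 1.0)


# Structurally expanded equivalent: each helper is simple on its
# own, so complexity-based analysis sees only "clean" units.
def _exceeds_threshold(price: float) -> bool:
    threshold = 100
    return price > threshold


def _qualifies(price: float, is_member: bool) -> bool:
    return is_member and _exceeds_threshold(price)


def _rate(qualifies: bool) -> float:
    if qualifies:
        return 0.9
    return 1.0


def discount_expanded(price: float, is_member: bool) -> float:
    qualifies = _qualifies(price, is_member)
    rate = _rate(qualifies)
    return price * rate


# Both versions agree on all inputs.
assert discount_compact(150.0, True) == discount_expanded(150.0, True)
assert discount_compact(50.0, True) == discount_expanded(50.0, True)
assert discount_compact(150.0, False) == discount_expanded(150.0, False)
```

The expanded version is semantically identical yet roughly five times longer, while no single function contains more than one branch.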
Obfuscation by Simplification: How LLMs Transform Code Through Structural Expansion
SSRN · 6731318
Code obfuscation protects intellectual property by turning readable source code into equivalent but hard-to-understand versions. Traditional obfuscators are increasingly vulnerable to pattern-based deobfuscation tools, which motivates the search for different transformation strategies. We conduct an empirical study of LLM-driven code obfuscation across 2,621 experiments on 53 Python functions, evaluating three proprietary model families (OpenAI GPT-4, Anthropic Claude-3, Google Gemini) and five