Concept erasure in text-to-image models is largely a facade that can be bypassed using text-free inversion attacks.
March 19, 2026
Original Paper
TINA: Text-Free Inversion Attack for Unlearned Text-to-Image Diffusion Models
arXiv · 2603.17828
The Takeaway
This research demonstrates that current 'concept erasure' safety techniques merely sever the text-to-image mapping while leaving the underlying visual knowledge intact. This challenges the validity of existing model unlearning benchmarks and necessitates a move toward visual-centric erasure methods.
From the abstract
Although text-to-image diffusion models exhibit remarkable generative power, concept erasure techniques are essential for their safe deployment to prevent the creation of harmful content. This has fostered a dynamic interplay between the development of erasure defenses and the adversarial probes designed to bypass them, and this co-evolution has progressively enhanced the efficacy of erasure methods. However, this adversarial co-evolution has converged on a narrow, text-centric paradigm that equ…