LGD achieves a major breakthrough in dataset distillation, reaching 60% accuracy on ImageNet-1K using only a handful of synthetic images.
April 2, 2026
Original Paper
Learnability-Guided Diffusion for Dataset Distillation
arXiv · 2604.00519
The Takeaway
Dataset distillation has historically failed to scale to complex datasets like ImageNet. LGD reduces redundancy by 39% and builds an adaptive curriculum, making it possible to train models on tiny, highly efficient synthetic datasets without a large drop in accuracy.
From the abstract
Training machine learning models on massive datasets is expensive and time-consuming. Dataset distillation addresses this by creating a small synthetic dataset that achieves the same performance as the full dataset. Recent methods use diffusion models to generate distilled data, either by promoting diversity or matching training gradients. However, existing approaches produce redundant training signals, where samples convey overlapping information. Empirically, disjoint subsets of distilled data …
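As a concrete reference point, below is a minimal sketch of how a distilled dataset is typically evaluated: a model is trained only on the small synthetic set and then scored on the real test split. The tensor shapes, toy classifier, and hyperparameters are illustrative assumptions for this sketch, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

# Assumed toy setup: a few synthetic images per class for 1,000 classes,
# at a reduced resolution so the sketch runs quickly. Real distilled
# ImageNet data would replace the random tensors below.
num_classes, images_per_class = 1000, 10
synthetic_images = torch.randn(num_classes * images_per_class, 3, 32, 32)
synthetic_labels = torch.arange(num_classes).repeat_interleave(images_per_class)

# Small stand-in classifier (not the architecture used in the paper).
model = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, num_classes),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

loader = DataLoader(TensorDataset(synthetic_images, synthetic_labels),
                    batch_size=256, shuffle=True)

# Train only on the synthetic (distilled) set.
for epoch in range(3):
    for x, y in loader:
        loss = F.cross_entropy(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# Evaluation on the real held-out test set (omitted here) is what
# produces the accuracy numbers reported for distilled datasets.
```

The 60% ImageNet-1K figure quoted above is presumably obtained under this kind of train-on-synthetic, test-on-real protocol, which is the standard way dataset distillation methods are compared.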