Introduces the largest-ever multi-modal CAD dataset with 10 million annotations for 1 million models to enable geometric deep learning on BRep data.
March 16, 2026
Original Paper
A2Z-10M+: Geometric Deep Learning with A-to-Z BRep Annotations for AI-Assisted CAD Modeling and Reverse Engineering
arXiv · 2603.12605
The Takeaway
At 5 terabytes, this dataset democratizes the ability to train foundation models for industrial reverse engineering and CAD modeling, a field previously limited by the lack of large-scale structured 3D data.
From the abstract
Reverse engineering and rapid prototyping of computer-aided design (CAD) models from 3D scans, sketches, or simple text prompts are vital in industrial product design. However, recent advances in geometric deep learning techniques lack a multi-modal understanding of parametric CAD features stored in their boundary representation (BRep). This study presents the largest compilation of 10 million multi-modal annotations and metadata for 1 million ABC CAD models, namely A2Z, to unlock an unprecedent