Releases Feynman, an agentic pipeline and 100k-sample dataset for generating high-quality, knowledge-rich diagrams with grounded captions.
March 16, 2026
Original Paper
Feynman: Knowledge-Infused Diagramming Agent for Scalable Visual Designs
arXiv · 2603.12597
The Takeaway
Visual reasoning data is hard to scale because well-aligned, high-quality image-text pairs are scarce in technical domains. This release offers a scalable way to synthesize complex diagrammatic data for training vision-language models.
From the abstract
Visual design is an essential application of state-of-the-art multi-modal AI systems. Improving these systems requires high-quality vision-language data at scale. Despite the abundance of internet image and text data, knowledge-rich and well-aligned image-text pairs are rare. In this paper, we present a scalable diagram generation pipeline built with our agent, Feynman. To create diagrams, Feynman first enumerates domain-specific knowledge components ("ideas") and performs code planning based